Final Report 


Review of “Bay Area/California High-Speed Rail Ridership and Revenue Forecasting Study”

David Brownstone, Mark Hansen and Samer Madanat 
June 30, 2010 


Executive Summary 

We have reviewed the key components of the California High Speed Rail Ridership 
Studies. The primary contractor for these studies, Cambridge Systematics (CS), has 
followed generally accepted professional standards in carrying out the demand modeling 
and analysis. Nevertheless, we have found some significant problems that render the key 
demand forecasting models unreliable for policy analysis. This Executive Summary 
describes the most serious problems. The body of this report elaborates on these 
problems and describes additional concerns we have. 

In broad terms, the approach taken by CS includes a model development phase and a 
model validation phase. In the model development phase, both historical data and survey 
data were employed to develop a mathematical model of interregional travel. The 
individuals surveyed were interregional trip makers. However, the mode choices of the 
individuals surveyed were not representative of California interregional travelers. For 
example, nearly 90% of long distance (over 100-mile) business passenger trips are made 
by car, while 78% of the long distance business travelers sampled for the study were 
traveling by air. 

The travelers in the sample were asked a series of questions concerning the mode choices 
they would make for the interregional trip that they were making at the time they were 
surveyed, under various hypothetical values of travel time, cost, service frequency, and 
service reliability for each modal alternative (auto, air, conventional rail, and high speed 
rail). The analysis did not take into account that the modes actually used by the sampled 
travelers were not representative of the traveler population. Since it is 
likely that travelers on different modes attach different degrees of importance to different 
service attributes (e.g. air travelers care more about travel time than auto travelers), it is 
likely that the resulting model gives a distorted view of the tastes of the average 
California traveler. 

CS attempted to adjust for this problem in the validation phase, by making sure that the 
model accurately replicated the observed market shares for the existing travel modes in 
the year 2000. Model predictions of trips by mode were compared with observed values. 
Parameters obtained in the model development phase were adjusted in order to obtain 
good agreement between predicted and observed values. 

Unfortunately, the methodology employed by CS for adjusting the model parameters has 
been shown to be incorrect for the type of model they employed. The parameters are 
therefore invalid and the forecasts based on them, in particular of high speed rail mode 
shares, are unreliable. (It should be noted that at the time CS performed the study the 
incorrectness of their adjustment method was not known.) 

We found other problems in model development and validation. CS changed key 
parameter values after the model development phase because the resulting estimates did 
not accord with the modelers’ a priori expectations. While this is frequently done in this 
type of work, it is important that the a priori expectations be based on experience with 
like contexts. Unfortunately, some of the a priori expectations used by CS are valid for 
intra-regional, but not for inter-regional ridership models. Specifically, the modelers 
increased the parameter for headway (the time between successive aircraft or train 
departures) and set it to a value typically found in intra-regional travel demand models. 
This adjustment made the predicted shares of the travel modes very sensitive to changes 
in frequency. 

Another problem was that CS employed a model structure that does not allow for 
travelers to choose between high speed rail stations, thereby exaggerating the importance 
of having frequent service at the single station that is judged to be “best” for a given trip. 
Together with the inflated value of the headway parameter described above, this 
unrealistically favors alignments that avoid dividing services onto branch routes, such as 
Pacheco. Correcting this deficiency would almost certainly reduce, although probably not 
eliminate, the ridership difference between the Pacheco and Altamont alignments found 
in the CS study. 

In the model validation phase, several other parameters of the mathematical model were 
adjusted. As a result of this process, many of the model parameters were assigned values 
that were considerably different from those obtained in the model development phase. In 
some instances changes to the model parameters were informed by professional 
judgments of the consulting team as well as the goal of replicating observed behavior. 

The resulting “validated” model, which is used to generate subsequent high speed rail 
ridership forecasts, provides reasonably accurate “backcasts” for the year 2000, reflects 
certain patterns of behavior observed in the traveler surveys, and accords with 
professional judgments of the consultant. However, the combination of problems in the 
development phase and subsequent changes made to model parameters in the validation 
phase implies that the forecasts of high speed rail demand—and hence of the profitability 
of the proposed high speed rail system—have very large error bounds. These bounds, 
which were not quantified by CS, may be large enough to include the possibility that the 
California HSR may achieve healthy profits and the possibility that it may incur 
significant revenue shortfalls. We believe that further work to both assess and reduce 
these bounds should be a high priority. 

Introduction 

This report represents the deliverable for the project titled “Peer Review of California 
High Speed Rail Ridership Forecasting Studies”. This project was initiated at the request 
of the State Senate Committee on Transportation and Housing, and was funded by the 
California High Speed Rail Authority (CA HSRA). The scope of the project was to 
review the Bay Area/California High-Speed Rail Ridership and Revenue Forecasting 
Study conducted by Cambridge Systematics for the Metropolitan Transportation 
Commission (MTC). The project team was to evaluate the technical quality of the 
demand model system development including, but not limited to, data sets (sample sizes 
and strategies, data sources and their combination), model structure, model specification, 
parameter estimation results and model validation. 

The project involved the following tasks: 

1. Review of relevant reports and documents 

2. Development of a list of questions based on the review; these questions were 
addressed to Cambridge Systematics 

3. Review of responses from Cambridge Systematics, submission of additional 
questions and review of the responses to these additional questions 

4. Preparation of a draft final report 

5. Preparation of a final report, which includes the CA HSRA’s comments on the 
draft final report. 

This final report includes three sections, including the introduction. The next section 
describes our technical evaluation of the HSR ridership model. The third section 
summarizes our assessment of the reliability and accuracy of the model. 

Appendix A includes our questions to Cambridge Systematics (the output of task 2) and 
the consultants’ responses to our questions. Appendix B includes additional questions 
that were submitted to CS after our review of these responses (the output of task 3), and 
Cambridge Systematics’ responses to these additional questions. Appendix C includes 
the CA HSRA’s comments on our draft report, and Appendix D includes our responses to 
the comments included in Appendix C. 
Technical Evaluation 


We very much appreciate Cambridge Systematics’ prompt and thorough responses to our 
questions (see Appendixes A and B). These responses have clarified issues that were not 
clearly explained in the reports. We are, for the most part, satisfied with their responses 
and agree that their work on this project meets generally accepted standards for travel 
demand modeling. We are, however, concerned about the impact of some of Cambridge 
Systematics’ modeling decisions on the reliability of the forecasts based upon these 
models. 

Our review of the ridership models and the responses provided by Cambridge 
Systematics has led us to the identification of a number of problems that will affect the 
accuracy and reliability of these models. These problems are listed below. 

1. Arbitrary division of trips into long and short trips 

Each model is segmented on short and long trips and these are estimated separately. The 
sharp delineations between different trip categories seem arbitrary and unnecessary. It 
should have been possible to formulate models that include dependencies on trip length 
(e.g., with interaction terms). 
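
As a hedged illustration of the interaction-term alternative mentioned above, a single destination choice utility could let the effect of distance vary smoothly instead of switching at 100 miles. The symbols below ($d_{ij}$ for distance, $L_{ij}$ for the mode choice logsum, $z_j$ for destination attributes, and the coefficients) are notation assumed for this sketch and are not taken from the CS model documentation:

```latex
V_{ij} = \beta_L L_{ij} + \beta_1 d_{ij} + \beta_2 d_{ij}^2 + \beta_3 \,(d_{ij} \cdot L_{ij}) + \gamma' z_j
```

Because every term is continuous in $d_{ij}$, a specification of this kind has no jump at any distance threshold, although it is not necessarily the form that would fit these data best.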

This segmentation causes problems on the long/short trip boundary (100 miles) as can 
easily be seen in Figure 3.4 in the Model System Development Final report (this figure 
relates to the destination choice models). The “net effect” of distance for short business 
trips declines from 0 to -4 at 100 miles, and then jumps back up to -1.7 before declining 
again and finally rising starting at 250 miles. Similar behavior occurs for the other trip 
types. This discontinuity will almost certainly cause strange forecasting behavior for trips 
close to 100 miles. 

Cambridge Systematics’ response to this comment is that the distance coefficient cannot 
be studied in isolation of other explanatory variables. We agree that travel time, and 
other variables, will vary with distance, and travel time enters in the model through the 
logsums from the mode choice models (and thus one cannot hold travel time constant 
while varying distance). But accounting for the effect of travel time and other variables 
will not eliminate the discontinuity in the effect of distance, and the problem will thus 
remain. 

2. Assigning all business trips to the peak period 

On page 3-2 of the Model System Development Final report, the market segments are defined. 
According to this definition, all business or commuting trips were assigned to peak 
conditions. This is potentially a serious problem. 

Cambridge Systematics’ response to this comment is that they have followed common 
practice in regional travel demand modeling. While we don’t doubt that this is the case, 
this fact is irrelevant to the subject, which is an interregional travel demand model. For 
interregional travel, quite a few business trips are made in the off-peak periods. For 
example, the California Travel survey shows over 25% of business trips in the off-peak 
periods. 


The effect of this assignment is that a measurement error is added to the level of service 
attributes for off-peak business travelers. Even if this measurement error is totally 
random it will lead to biased and inconsistent parameter estimates. This measurement 
error is in fact not random - off-peak business travelers face better levels of service for 
auto and worse levels of service for other modes. 
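
The claim that even purely random measurement error biases the estimates can be illustrated with a small simulation. The sketch below uses ordinary linear regression, not a discrete choice model, because the classical attenuation result has a simple closed form there; the coefficient, variances, and sample size are hypothetical values chosen only for illustration.

```python
import numpy as np

def attenuation_demo(beta=-1.0, sd_x=1.0, sd_u=0.5, n=200_000, seed=0):
    """Compare the slope estimated with a correctly measured regressor
    against the slope estimated when random error is added to it."""
    rng = np.random.default_rng(seed)
    x_true = rng.normal(0.0, sd_x, n)             # true level of service
    y = beta * x_true + rng.normal(0.0, 1.0, n)   # outcome driven by the true value
    x_obs = x_true + rng.normal(0.0, sd_u, n)     # mismeasured level of service

    slope_true = np.cov(x_true, y)[0, 1] / np.var(x_true, ddof=1)
    slope_obs = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)
    # Classical errors-in-variables limit: beta * var(x) / (var(x) + var(u))
    expected_obs = beta * sd_x**2 / (sd_x**2 + sd_u**2)
    return slope_true, slope_obs, expected_obs

slope_true, slope_obs, expected_obs = attenuation_demo()
print(slope_true, slope_obs, expected_obs)
```

In this linear setting the mismeasured slope shrinks toward zero and does not recover as the sample grows. The error described in the text is systematic, not random, so the direction of the resulting bias in the CS models cannot be read off this sketch.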

3. Incorrect treatment of the panel data set in the main mode choice model 

A concern with the Stated Preference (SP) data used for estimation is that each 
respondent provided responses to several hypothetical choice situations. Thus, there is 
likely unobserved serial correlation between the choices made by each respondent. It 
appears that Cambridge Systematics treated each SP response as independent and thus 
ignored this likely serial correlation. Under strong assumptions about the sources of this 
serial correlation (independent of everything else in the model) the parameter estimates 
are still consistent (though, for reasons described under point 6 below, the parameter 
estimates of this model are in fact not consistent). However, the standard errors of these 
parameters are downward biased. This downward bias causes the reported t-statistics to 
be inflated and this in turn leads to overstated statistical significance of the model 
parameters. 
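
The mechanism behind the understated standard errors can be sketched with a simple simulation. This is an illustration for a sample mean, not for the mode choice model itself, and the number of respondents, the number of choice tasks, and the variance components are hypothetical.

```python
import numpy as np

def naive_vs_clustered_se(n_people=500, n_tasks=8, sd_person=1.0,
                          sd_noise=1.0, seed=0):
    """Standard error of a mean when each respondent contributes several
    correlated responses, computed two ways."""
    rng = np.random.default_rng(seed)
    # A persistent respondent-level effect induces correlation across tasks.
    person_effect = rng.normal(0.0, sd_person, size=(n_people, 1))
    y = person_effect + rng.normal(0.0, sd_noise, size=(n_people, n_tasks))

    # Naive: treats all n_people * n_tasks responses as independent.
    naive_se = y.std(ddof=1) / np.sqrt(y.size)
    # Clustered: treats each respondent as one independent unit.
    person_means = y.mean(axis=1)
    clustered_se = person_means.std(ddof=1) / np.sqrt(n_people)
    return naive_se, clustered_se

naive_se, clustered_se = naive_vs_clustered_se()
print(naive_se, clustered_se)
```

The naive figure is smaller because it counts repeated answers from the same person as new information, which is the same reason the reported t-statistics are inflated.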

More realistic assumptions about how respondents treat multiple SP choices lead to more 
complex choice models. Recent work has focused on preference heterogeneity as a likely 
source of unobserved correlations across repeated choices. Hensher 4 and Brownstone et 
al 5 show that ignoring this preference heterogeneity can bias value of time measures by 
as much as 100 per cent and associated t-statistics by as much as 100 per cent. 

4. Constraining the headway coefficient in the main mode choice model 

The fourth bullet on page 3-36 in the Model System Development Final report explains 
that the headway coefficient estimate was about 20% as large as that for in-vehicle 
travel time. The authors of the report consider this result inadequate, expecting the two 
coefficients to be about equal (under the assumption of waiting time being equal to half 
the headway, and waiting time being about twice as onerous as in-vehicle travel time). 
Thus, in the final models, the headway coefficient was constrained by the modelers to be 
equal to the in-vehicle travel time coefficient. 
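
The modelers’ reasoning, as described above, can be written out as a short piece of arithmetic. The symbols ($h$ for headway, $\beta_{ivt}$, $\beta_{wait}$ and $\beta_h$ for the in-vehicle time, waiting time and headway coefficients) are notation introduced here for illustration:

```latex
\begin{aligned}
\text{expected wait} &= \tfrac{h}{2}, \qquad \beta_{wait} = 2\,\beta_{ivt} \\
\beta_{wait}\cdot\tfrac{h}{2} &= 2\,\beta_{ivt}\cdot\tfrac{h}{2} = \beta_{ivt}\,h
\;\;\Rightarrow\;\; \beta_h = \beta_{ivt} \quad \text{(constraint imposed)} \\
\hat{\beta}_h &\approx 0.2\,\hat{\beta}_{ivt} \quad \text{(value estimated from the data)}
\end{aligned}
```

Relative to the estimate, the constraint therefore raises the weight placed on headway by a factor of roughly five.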

The modelers’ expectation would be reasonable if this were an urban travel demand model, 
but it is incorrect in the present context. This is because, in inter-city travel where 
headways are longer, passengers do not arrive randomly at the stations or airports, but 
rather according to the vehicles’ schedule, which implies that the waiting time cannot be 
assumed to be equal to half the headway. There is therefore no reason to expect the 
coefficients on travel time and waiting time to be equal. Indeed, a recent study by Adler, 
for example, concluded that air travelers place 4-5 times as much weight on travel time as 
on the time difference between their ideal departure time and when service is available. [6] 

[4] Hensher, D.A. (2001). The sensitivity of the valuation of travel time savings to the specification of the 
unobserved effects. Transportation Research E, 37, pp. 139-142. 

[5] Brownstone, D., D.S. Bunch, and K. Train (2000). Joint mixed logit models of stated and revealed 
preferences for alternative-fueled vehicles. Transportation Research B, 34, pp. 315-338. 

It has been argued that if service headways are sufficiently low, high speed rail travelers 
may indeed use the system in a manner similar to some urban transit riders, arriving at 
stations randomly and waiting for the next trains. For such travelers, constraining the 
waiting time coefficient to equal that for travel time may be appropriate. It is clearly 
inappropriate for air travelers, however. 

5. Absence of an airport/station choice model 

The modeling approach does not explicitly consider the choice of airport or station. 

Given the trip origin and destination, the model determines the airport or stations that 
would be used for access and egress “taking into account the level of service of the access 
and egress modes and the frequency of service at each station and airport ....” (See 
Appendix A). Such “all-or-nothing” assignment is behaviorally unrealistic since, 
depending on their desired travel schedule, access/egress modes and other factors, 
travelers may choose different stations. 

The failure to consider station choice is important when comparing ridership for the 
Altamont and Pacheco corridors. In the Altamont alternative, trips between South Bay 
locations and Southern California must be exclusively assigned to either a station on the 
San Jose or the San Francisco lines, and the frequency of service will be accordingly 
reduced. In reality, travelers in these markets could choose either line, depending on 
which is most convenient for their particular travel schedule. Failure to consider this 
possibility means that the inconvenience of the split schedule assumed for the Altamont 
alternative is exaggerated. The problem is compounded by the inflated value of the 
headway coefficient, as pointed out earlier. Correcting this deficiency would reduce, 
although probably not eliminate, the projected ridership difference between the two 
alignments. 
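
The difference between an all-or-nothing station assignment and a model in which travelers can choose among stations can be sketched with a logit logsum, a standard way of summarizing the value of a choice set. This is not the CS procedure or a proposal from the reviewers; the utility values below are hypothetical and stand for a South Bay traveler facing two lines with similar service.

```python
import math

def best_station_utility(utilities):
    """All-or-nothing assignment: only the single best station counts."""
    return max(utilities)

def station_logsum(utilities):
    """Logit logsum: expected maximum utility (up to a constant) when the
    traveler may pick whichever station suits a particular trip."""
    top = max(utilities)  # subtract the maximum for numerical stability
    return top + math.log(sum(math.exp(u - top) for u in utilities))

# Hypothetical utilities of using a station on each line.
san_jose_line = -2.0
san_francisco_line = -2.3

all_or_nothing = best_station_utility([san_jose_line, san_francisco_line])
with_choice = station_logsum([san_jose_line, san_francisco_line])
print(all_or_nothing, with_choice)
```

The logsum is never below the best single utility, so a model that allows station choice credits the traveler with some benefit from the second line, which is the effect the all-or-nothing assignment leaves out.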

6. Incorrect calibration of the alternative-specific constants in the mode choice models 

Some models in the model system were estimated using SP data or merged SP data with 
Revealed Preference (RP) data from the Caltrans Travel Survey and various Metropolitan 
Planning Organizations (MPO) surveys. The SP data sets and any merged data sets that 
used the SP data are choice based with respect to at least some of the models, so 
unweighted maximum likelihood estimation will lead to inconsistent parameter estimates 
and inconsistent inference. The only exception to this general result is a Multinomial 
Logit Model with a full set of alternative specific constants, and here the parameter 
inconsistency is limited to the coefficients of these alternative specific constants. This 
inconsistency can be solved by calibrating these alternative specific constants to external 
data on mode shares, and this is what the modelers did in the model validation. 

[6] Thomas Adler et al., Modeling Service Trade-Offs in Air Itinerary Choices, Transportation Research 
Record 1915 (2005), pp. 20-26. 
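
For the Multinomial Logit case, the textbook form of this correction shifts each estimated alternative specific constant by the log of the ratio of sample share to population share. The sketch below shows that closed-form offset; it is not necessarily the calibration procedure CS followed. The 78% air sample share and the roughly 90% car population share come from the Executive Summary, while the remaining shares (and the restriction to two modes) are assumptions made so the example adds up.

```python
import math

def asc_corrections(sample_shares, population_shares, base_mode):
    """Amount to add to each estimated alternative specific constant so that
    a choice-based sample reproduces population shares in a multinomial logit.
    Corrections are expressed relative to the base mode."""
    log_ratio = {
        mode: math.log(sample_shares[mode] / population_shares[mode])
        for mode in sample_shares
    }
    return {mode: -(log_ratio[mode] - log_ratio[base_mode]) for mode in log_ratio}

# Two-mode simplification; the car sample share and air population share are assumed.
sample = {"car": 0.22, "air": 0.78}
population = {"car": 0.90, "air": 0.10}

corrections = asc_corrections(sample, population, base_mode="car")
print(corrections)
```

The correction for air is negative because air travelers are heavily over-represented in the sample. This offset repairs only the constants and only in the multinomial case.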

However, the Main Mode Choice model and the Access/Egress Mode Choice model are 
not Multinomial Logit, but rather Nested Logit models, for which the calibration of the 
alternative specific constants to external data does not solve the inconsistency problem. 
Cambridge Systematics appears to have relied on Koppelman and Garrow [7] to justify their 
estimation and calibration procedure. Unfortunately, more recent work by Bierlaire, 
Bolduc and McFadden [8] shows both theoretically and with real data examples that the 
Koppelman and Garrow procedure is wrong. One of their examples was based on a study 
of a proposed high-speed rail line in Switzerland, and for these data they found that the 
key coefficient on the high-speed train travel time was downward biased by a factor of 
two. 

Therefore, the calibration process followed by the modelers does not eliminate the 
parameter biases introduced by using unweighted maximum likelihood estimation with 
their choice-based SP samples. More specifically, all the parameters (including the key 
level of service, cost, and time parameters) of the Main Mode Choice models and the 
Access/Egress Mode Choice models are biased and inconsistent. 
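
For reference, a standard two-level nested logit can be written as follows. The notation ($V_j$ for systematic utility, $\mu_m$ for the scale parameter of nest $m$, $I_m$ for the nest logsum) is generic textbook notation and is not taken from the CS reports:

```latex
P(j) = P(j \mid m)\,P(m), \qquad
P(j \mid m) = \frac{e^{V_j/\mu_m}}{\sum_{k \in m} e^{V_k/\mu_m}}, \qquad
P(m) = \frac{e^{\mu_m I_m}}{\sum_{n} e^{\mu_n I_n}}, \qquad
I_m = \ln \sum_{k \in m} e^{V_k/\mu_m}
```

When every $\mu_m = 1$ this collapses to the Multinomial Logit Model. Otherwise the alternative specific constants sit inside the nest logsums and are scaled by $\mu_m$, which may give some informal intuition for why a correction applied only to the constants does not carry over from the multinomial to the nested case.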

Note that the SP samples are not strictly choice-based, since they are stratified on the 
mode chosen for an actual trip. However, since any unobservable effects that caused the 
respondents to choose a particular mode for an actual trip are likely to influence their 
choices of hypothetical trip modes, the SP samples are clearly endogenous. As Cosslett 
and Wooldridge [9] show, the same problems that occur with choice-based samples occur 
with more general endogenous samples. Even in the highly unlikely event that the 
respondents’ SP choices are independent of their actual trip choices, simply changing the 
alternative specific constants to deal with non-representative samples is not an 
appropriate way to generate consistent forecasts. It is simple to test this strong 
independence assumption by including variables indicating the respondents’ 
actual mode choices in the models explaining their SP choices. As reported on page 3-33 
of the final model development report, when this was tried these actual choice 
variables were significant and led to large changes in the other model coefficients. This 
shows that respondents’ SP choices are not independent of their actual choices, and thus 
the SP sample is clearly endogenous. 


[7] F.S. Koppelman and L.A. Garrow (2005). Efficiently Estimating Nested Logit Models with Choice-based 
Samples. Transportation Research Record, 1921, pp. 63-69. 

[8] Bierlaire, M., D. Bolduc, and D. McFadden (2008). The Estimation of Generalized Extreme Value 
Models from Choice-based Samples. Transportation Research B, 42, pp. 381-394. 

[9] Cosslett, S.R. (1993). “Estimation from Endogenously Stratified Samples,” in Handbook of Statistics, 
Volume 11, ed. by G. S. Maddala, C. R. Rao, and H. D. Vinod. Amsterdam: North-Holland, pp. 1-43. 
Wooldridge, J.M. (1999). “Asymptotic Properties of Weighted M-Estimators for Variable Probability 
Samples.” Econometrica, 67(6), pp. 1385-1406. 
7. Excessive constraining of coefficients in the final models 


The final results of the model validation and calibration process were made available by 
Cambridge Systematics, in a memo sent by George Mazur to Nick Brand, dated January 
29, 2010, with the subject line: “Final Coefficients and Constraints in HSR Ridership and 
Revenue Model”. 

In addition to constraining the alternative-specific constants, location constants and 
airport-specific constants (in an attempt to eliminate the bias introduced by using choice- 
based samples), the modelers constrained some accessibility/logsum variables in all 
models and two critical Level-Of-Service (LOS) variables in the main mode choice 
model. 

Constraining model parameters (other than for correcting for choice-based sampling) 
requires strong justification. In the absence of such justifications, constraints represent 
the modelers’ professional judgment. It is sometimes necessary to use judgment to 
override estimation results when these results are clearly unrealistic. The problem with 
simply constraining parameters as a way of imposing professional judgment is that there 
is no way for external reviewers to know whether the final model results are primarily 
due to judgment or based on the data used to estimate the models. The only thing that is 
certain is that the standard errors of the estimated coefficients are biased downwards 
when other coefficient values are constrained. 

Cambridge Systematics did not perform an analysis of the forecast error bounds of the 
model system. These error bounds should reflect both the effect of uncertainty in 
predicting the explanatory variables and the standard errors in the parameter estimates. 

As we have indicated in this report, these standard errors are understated (cf. points 3 and 
7). This leads us to conclude that the true error bounds of the forecasts are likely very 
wide. 
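
One common way to quantify such error bounds is Monte Carlo simulation: draw the uncertain parameters and inputs many times and look at the spread of the resulting forecasts. The sketch below does this for a deliberately simple binary logit share. It is not the CS model system, and every number in it (coefficient, standard errors, travel times) is hypothetical.

```python
import numpy as np

def hsr_share(beta_time, hsr_time, other_time):
    """Binary logit share for HSR against one competing mode (times in hours)."""
    return 1.0 / (1.0 + np.exp(-beta_time * (hsr_time - other_time)))

def share_interval(beta_hat, beta_se, hsr_time=3.0, hsr_time_sd=0.25,
                   other_time=5.0, n_draws=20_000, seed=0):
    """95% simulation interval for the HSR share, reflecting uncertainty in
    both the time coefficient and the forecast HSR travel time."""
    rng = np.random.default_rng(seed)
    beta_draws = rng.normal(beta_hat, beta_se, n_draws)    # parameter uncertainty
    time_draws = rng.normal(hsr_time, hsr_time_sd, n_draws)  # input uncertainty
    shares = hsr_share(beta_draws, time_draws, other_time)
    low, high = np.percentile(shares, [2.5, 97.5])
    return low, high

# Same point estimate, two assumed standard errors: an understated one and a larger one.
understated = share_interval(beta_hat=-0.5, beta_se=0.05)
larger = share_interval(beta_hat=-0.5, beta_se=0.20)
print(understated, larger)
```

With the point estimate held fixed, the interval widens as the assumed standard error grows, which is why understated standard errors translate directly into understated forecast error bounds.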


Conclusions 

It is not possible to predict exactly the net effect of the modeling problems that we have 
identified. However, we can say that the sheer number of constraints placed on the 
model parameters makes this model system unreliable for predicting the shares of the 
competing modes (air, train, automobile and HSR) in the California HSR corridor. While 
there are advanced econometric techniques that could alleviate some of the biases we 
have found, it is likely that application of these techniques would lead to larger standard 
errors than are reported. As we have noted earlier these reported standard errors are 
already biased downward. 

Our main conclusion is that the true confidence bands around the estimates from these 
models must be very wide. They are probably wide enough to include demand scenarios 
where HSR will lose substantial amounts of money as well as those where it will make a 
healthy profit. 


We believe that the priority, if additional work is performed, should be to accurately 
quantify the error bounds associated with the model system forecasts, and to attempt to 
reduce these bounds. 


Memorandum 


TO: Samer Madanat 
FROM: Lance Neumann, Chief Executive Officer 
DATE: June 8, 2010 
RE: Responses to UC Berkeley’s May 17, 2010 Questions 



We welcome the opportunity to respond to your questions regarding a very complex model and 
model development project. We recognize the importance of your review and our response, 
given the reliance on the model by the California High-Speed Rail Authority. This also is an 
opportunity for Cambridge Systematics (CS) to speak to some of the misunderstandings and 
misinterpretations of our work by others over the past six months. We believe that our 
responses demonstrate that we engaged in a carefully thought out model development and 
application process that resulted in a professionally developed, state-of-the-practice model. 

We understand that the focus of your review so far has been on the Interregional Model System 
Development Final Report. It also is our understanding that the scope of your review includes the 
underlying contract for the Metropolitan Transportation Commission (MTC) study, work plans, 
and related reports, including peer review reports. Some of the questions and concerns raised 
in your memorandum are addressed in this other documentation that CS and the High-Speed 
Rail Authority (HSRA) provided to your team. Our responses include appropriate references to 
direct you to those other project reports, which are important to a complete documentation of 
the modeling work. 

Questions in your memorandum are grounded in travel demand forecasting theory. We believe 
that you also appreciate, as we do, that strict adherence to theory typically is neither feasible nor 
desirable in practice. This tension between theory and practice poses challenging tradeoffs as 
one develops a credible model for real world application, and we anticipate that you will 
communicate this in your report. 

For the development of this model, a peer review panel was formed and engaged to help guide 
these tradeoffs prior to and during data collection and model development. While the peer 
review panel’s comments were advisory in nature, their active engagement and advice to the 
MTC, the HSRA, and the consulting team helped assure that the resulting ridership and 
revenue model met the project objectives. Those objectives included: 


• A travel modeling system for examining high-speed rail alternatives in California, in 
particular, from the San Joaquin Valley to the San Francisco Bay Area; 

• Appropriate for preparing ridership and revenue forecasts and other measures such as user 
benefits, travel time and travel cost savings for new riders, and impacts on other modes; 

• Intended primarily for use in further detailed environmental analysis work to be conducted 
by the HSRA; and, 

• Employing a network-based modeling system using commercially available modeling 
software in use at California’s metropolitan planning organizations and Caltrans. 

In our responses, we have pointed out how we bridged the gap between the conflicting 
demands of modeling theory and practical model development and application to achieve these 
project objectives. 

It is the firm belief of the model development and application teams and the HSRA that this 
ridership and revenue model achieves the project objectives, and that the model development 
process followed accepted model development practices for statewide travel modeling 
engagements. This model was carefully developed by experienced leaders in travel modeling, 
and reflects experience that CS has gained in nearly 40 years of worldwide travel demand 
model development and application. 

Our response to your questions was directed by George Mazur, the CS project manager for our 
ongoing work with the HSRA, in collaboration with Kimon Proussaloglou, David Kurth, Mark 
Bradley, and Maren Outwater. These individuals have been involved in the modeling work. 

Thank you for the opportunity to respond to your questions. We anticipate that a careful 
review of the attached responses and related project documentation will allay your concerns. 
CS staff is available to respond to any additional questions you may have, and I encourage you 
to contact us to support your effort to provide a fully informed, objective, unbiased review of 
the ridership and revenue forecasting model developed for this project. 



Attachment to June 8, 2010 Memorandum 


The original questions posed by the UC Berkeley team are listed in italics prior to providing a 
response. In many cases, we provide additional background information to explain the 
rationale for the adopted approach to survey and sample design, data collection, and model 
estimation and validation. Where appropriate, we also provide references to the literature and to 
documents that describe the data collection and modeling aspects of this effort. 

The approach to data collection was shaped by focusing more on those interregional travel 
markets which the proposed high-speed rail service would serve. The corresponding data plan 
was developed through discussions with the Metropolitan Transportation Commission (MTC), 
the High-Speed Rail Authority (HSRA), and the peer review panel. 

The approach to model estimation was to use best practice approaches to develop a robust 
disaggregate model system for the entire State that distinguishes trips by purpose, distance, 
current mode used, and geographic markets of interest. 

The objective of model validation was to properly determine the size of the total travel market 
and the magnitude of travel flows across modes and origin-destination markets. To support 
model validation for the base year conditions, traffic count data on observed highway volumes, 
flows of air travel by O-D pair, and intercity rail ridership were analyzed. 



Part I. Data 


The models developed by Cambridge Systematics for forecasting HSR demand depend on a wide 
variety of data sources. We will concentrate on the key data sources used for model 
estimation. As best as we can tell, the target populations of these data are residents of California. 
This is problematic because California is among the most popular tourist destinations in the 
world, and visitors are disproportionately more likely to use HSR than residents (if for no 
other reason than they have no private cars available). Some of the calibration (notably the 
TAA ticket sample) includes visitors, so the calibration procedures probably scaled up air 
demand to include them. Was there a similar “scaling up” for users of other modes? 

The interregional component of the high-speed rail ridership and revenue model includes 
in-state travel by both California residents and nonresidents (i.e., visitors). All of the calibration 
and validation datasets (including air, auto, and conventional rail) included in-state resident 
and nonresident travel. 

This comment correctly points out that the target population for the various surveys was 
California residents, and that the resulting model “scales up” resident travel to account for both 
resident and nonresident travel. This scaling process is consistent with peer review guidance 
on the inclusion of visitor travel. [1] 

The adopted approach and the resulting validated model properly size up the total 
interregional travel market. The resulting model also accounts for the magnitude of travel flows 
across all modes and in different geographic origin-destination markets. 

Scaling is a reasonable approach given the low proportion of interregional nonresident travel 
that occurs in California. Although there is no direct information on interregional nonresident 
travel within California, data collected in Florida illuminate this issue. It is estimated that
nonresidents account for 8 percent of vehicle travel within Florida. This share has dropped
consistently since the early 1990s. 2

Since Florida draws more domestic 3, 4 and international visitors than California 5 , yet has half the
resident population of California 6 , it is reasonable to infer that nonresidents account for a much 
smaller share of total vehicle travel in California than reported in Florida. 


1 Cambridge Systematics, Inc. (July 2006), Bay Area/California High-Speed Rail Ridership and Revenue 
Forecasting Study: Findings from Second Peer Review Panel Meeting, Draft Report, p. 2-2. 

2 Florida Department of Transportation, Office of Policy Planning (November 2008), Trends and 
Conditions Report - 2008, Table 3. 

3 Ibid, p. 3, Table 1 and Figure 4. 

4 D.K. Shifflet and Associates (June 2009), California 2008 Data Tables Public Version, p. 8. 

5 U.S. Department of Commerce, Office of Travel and Tourism Industries (May 2010), Overseas Visitation 
Estimates for U.S. States, Cities and Census Regions: 2009, p. 4. 

6 U.S. Census Bureau; State and County Quick Facts ; http://quickfacts.census.gov/qfd/index.html . 
The statement “visitors are disproportionately more likely to use HSR than residents (if for no other
reason than they have no private cars available)” is debatable. We are not aware of any research or
published material that validates the hypothesis that visitors will use HSR at a higher rate than 
residents. Further, published reports show that 43 percent of domestic visitors arrive in 
California in their own private vehicle 7 , and 68 percent of overseas visitors to California rent a 
car or have other access to a private auto at some point during their visit 8 . 

However, even if the hypothesis stated in the comment is true, then the current ridership and 
revenue model provides a conservative forecast of high-speed rail’s ridership potential from the 
nonresident market. 

In summary, we disagree with the characterization of the sampling frame and the scaling 
process as “problematic”. Travel by nonresidents is a small portion of total in-state travel, and 
nonresidents have access to private autos and rental cars. The scaling process used in this 
model is a reasonable and appropriate approach to scale up to the total size of the travel market, 
and to reflect the distribution of flows in the State across modes and geographic origin-destination 
markets. 

The NuStats household survey appears to consider only trips of 24-hour duration or less. Is 
that true? 

No, this is not true. 

The California statewide household travel survey collected data over a 24-hour period and 
included trips that began on the specified day for the diary (3:00 a.m. to 3:00 a.m.), regardless of 
whether they lasted 24 hours or more. 

The trips that lasted more than 24 hours were reported as a trip that began on that day, but did 
not return to the trip origin during the 24-hour period. A “trip,” as referenced in the reports
and used in the travel model, refers to a one-way trip, not a round trip or a tour.

The SP surveys designed for the Cambridge Systematics project also may have other problems.
We do not know why there was no sampling at any Los Angeles area airport, but this exclusion
could lead to biases.

We take exception to the blanket statement that the SP surveys “may have other problems.”

With respect to the specific concern about sampling at LA area airports - we were precluded 
from sampling at these airports and took steps to mitigate the potential bias this preclusion 
might have caused. 

Extensive discussion of the sampling plan occurred between the consultant team, MTC, HSRA, 
and the peer review panel. The original plan was to conduct air intercept surveys at one or 
more airports in the Los Angeles basin. However, none of the LA airports contacted would 
agree to participate in the survey. 

7 D.K. Shifflet and Associates (June 2009). California 2008 Data Tables Public Version. Table 18. 

8 CIC Research, Inc. (July 2009). Overseas and Mexican Visitors to California 2008. p. IX. 
To mitigate the impact of the lack of access to LA area airports, the survey sampling approach
was modified to ensure that a good selection of trips to/from LA airports was obtained. This 
objective was accomplished by: 

1. Adding Fresno as a survey site; 

2. Sampling flights at other airports to reflect the share of flights observed to each of the LA 
airports; and 

3. Expanding the time of day during which surveys would be conducted. 

The air intercept survey was conducted during morning, midday, and evening hours to ensure 
a mixture of outbound and return trips. For example, interviewing travelers at SFO allowed us 
to capture both SF area residents who were making their outbound trip to LA Basin airports, 
and LA Basin residents who were making their return trip back to LA. 

Question 7 in the air intercept survey asked for a traveler’s home zip code 9 . Of the 1,016 air
intercept surveys that had a valid California home zip code, the distribution of these home zip
codes was as follows:


• MTC region: 26%

• SACOG region: 17%

• SANDAG region: 14%

• SCAG region: 35%

• San Joaquin Valley: 4%

• Elsewhere in California: 4%

As shown by these data, the survey sample included a substantial proportion of households in
the SCAG region (which includes the Los Angeles Basin), even though intercept surveys were
not allowed at airports in the LA Basin.

In summary, we believe that the sampling plan for the air intercept survey avoids the biases 
that concern you, since it properly addresses all key origin-destination air travel markets of 
interest. 

The airport SP screening questions also excluded those whose trip origin was in California but 
were connecting to a flight in California to a final destination outside California. Was this 
possibility accounted for? 

Yes, including connecting air travel was considered, but it was excluded from the model
design for two primary reasons.


9 Corey, Canapary & Galanis Research (December 2005), High-Speed Rail Study Survey Documentation, p. 2 
and p. 1 of example survey form. 
First, the diversion of connecting air travelers was included in prior modeling efforts sponsored
by the HSRA. This market segment was found to be very small, accounting for less than one
percent of HSR’s ridership and revenue potential 10 .

Second, surveying connecting passengers would have required separate versions of the stated- 
preference questionnaire. Connecting passengers would have had to be treated as a separate 
market segment with concomitant survey size requirements. This, in turn, would have
substantially increased survey costs.

In summary, the combination of limited market potential and high survey costs led to a decision 
not to include connecting air travel in the model design 11 . 

The automobile SP survey also includes some problematic choices. The screening question 
excludes everybody except those households that took at least one interregional auto trip in the 
HSR service area. It would have been useful to at least collect information on all trips (and 
their modes) in the HSR service area. This would have provided an additional source of
random sample information that could be used to check problems caused by the choice-based
nature of the primary data sets. Is there any reason why this was not done? 

We disagree with the statements that the automobile SP survey “also includes some problematic
choices” and that the sampling plan may introduce “problems caused by the choice-based nature of
the primary data sets”.

The primary purpose for collecting new surveys as part of the model development project was 
to gather stated-preference information to estimate mode choice models that could be used to 
forecast HSR’s ridership potential 12 . This focus was identified very early in the project. 
Although multiple sources of pre-existing revealed-preference data were available, no prior 
stated-preference database existed to explore the potential of shifting to HSR. 

The revealed-preference portion of the new surveys was collected to: 

1. Supplement existing Caltrans and MPO survey datasets to estimate or validate trip
frequency and destination choice models; and

2. Provide inputs to develop customized choice exercises for each respondent in the stated- 
preference survey. 

“Information on all trips”: The survey design and sampling plan were discussed at length. An 
early draft included questions on all trips and modes taken. However, the additional questions 
were quite lengthy and burdensome to add to telephone and intercept surveys that were 
already quite long. The merits of different survey elements and sample sizes were debated at 


10 Charles River Associates Inc. (January 2000), Independent Ridership and Passenger Revenue Projections for 
High-Speed Rail Alternatives in California, Draft Final Report, Table 4-7. 

11 Cambridge Systematics, Inc. (May 2005), Bay Area/California High-Speed Rail Ridership and Revenue 
Forecasting Study: Model Design, Data Collection and Performance Measures, Table 3.2 and p 3-8. 

12 Ibid; p. 3-1. 





length during the first peer review meeting. In the end, there was unanimous agreement that 
resource allocation decisions should favor expansion of the survey sample size. 13 This
agreement led to forgoing other potential survey ideas, such as a retrospective trip frequency survey.

“Additional source of random sample information”: These additional data are not needed for
screening, model estimation, or for model validation and application. The representation of
some segments in a greater proportion than their true incidence in the population due to
choice-based sampling is taken into account and explicitly controlled for during the model
development process in the following ways:

• Screening. The focus of the SP survey was on the most relevant trips for the study and
corridors of interest. The objective of this survey was to make respondents think about a recent
trip and put themselves in the position of comparing their current travel options against a
well-defined alternative that does not currently exist.

• Model Estimation. The choice-based nature of the primary dataset does not cause problems 
either in model estimation or in model application. Estimation using random and choice- 
based samples produces efficient (best linear) and unbiased level of service and cost 
coefficients. 

• Model Validation/Application. Choice-based sampling only affects the size of the modal 
constants. This simply reflects the fact that some market segments are represented in 
greater proportion than their true incidence in the population. Modal constants are adjusted 
during model validation using well-accepted methods to eliminate any biases. 14, 15, 16
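
The adjustment of modal constants described in the last bullet can be illustrated with a minimal
sketch. It assumes a multinomial logit model estimated with a full set of modal constants, for
which a standard result is that choice-based sampling shifts each constant by the log of the ratio
of that mode's sample share to its population share. The function name and all share values below
are hypothetical and are not taken from the study; in practice the constants may instead be
calibrated iteratively against observed flows.

```python
import math


def correct_constants(estimated_constants, sample_shares, population_shares):
    """Remove choice-based sampling bias from MNL modal constants.

    For each mode j: corrected_j = estimated_j - ln(sample_share_j / population_share_j).
    Assumes an MNL model with a full set of alternative-specific constants.
    """
    corrected = {}
    for mode, constant in estimated_constants.items():
        ratio = sample_shares[mode] / population_shares[mode]
        corrected[mode] = constant - math.log(ratio)
    return corrected


# Hypothetical illustration: air travelers are over-sampled relative to the population.
estimated = {"auto": 0.0, "air": 1.2}
sample = {"auto": 0.5, "air": 0.5}
population = {"auto": 0.9, "air": 0.1}

corrected = correct_constants(estimated, sample, population)
# Only differences between constants matter, so report air relative to auto.
air_relative_to_auto = corrected["air"] - corrected["auto"]
```

In this illustration the over-sampled mode's constant is reduced once the sampling shares are
accounted for, while the level-of-service and cost coefficients are left unchanged.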

In summary, the use of choice-based sampling does not introduce any problems. Instead, it 
provides a very powerful approach to focus on members of one or more small market segments 
of interest to this study. A larger sample size from these market segments allows us to analyze 
their behavior to better understand and quantify the determinants of both their current and 
future travel choice behavior. 


13 Cambridge Systematics, Inc. (July 2005), Bay Area/California High-Speed Rail Ridership and Revenue 
Forecasting Study: Findings from First Peer Review Panel Meeting, Draft Report, p. 4-3. 

14 Manski, C. F., and S. R. Lerman, (1977), The Estimation of Choice Probabilities from Choice-Based 
Samples, Econometrica 45(8): 1977-88. 

15 Ben-Akiva, M. E., and S. R. Lerman, (1985), Discrete Choice Analysis: Theory and Application to Travel 
Demand, MIT Press, Cambridge, Massachusetts. 

16 Koppelman, F. S., and L. A. Garrow, (2005), Efficiently Estimating Nested Logit Models with Choice- 
Based Samples: Example Applications, Transportation Research Record 1921 : pp. 63-69. 
The SP tasks only vary fuel costs for the auto alternatives and single fares for air and HSR. It
is not clear from the documentation how respondents were supposed to treat trips with
multiple passengers. It would have been much better to be explicit about this when building the
customized SP survey task.

The choice experiment included clear instructions for how respondents should consider party 
size. Both the choice experiment and the estimated models accounted for differences in cost 
considerations between traveling alone and traveling in groups with multiple passengers. 

Key aspects of the choice experiment include the following: 

• The experiment was customized to each respondent’s recent travel and was based on the
reference trip described by the respondent, including party size.

• The difference between driving alone and carpooling with other passengers was explicitly
mentioned in the introduction to the stated-preference survey 17 . (“On the trip you described
to us, you indicated there were a total of ___ persons (including yourself) traveling in
your party”.)

• Comparable and consistent costs across all modes were displayed so that the respondent 
could make a stated choice for the same reference trip. 

• It is reasonable to assume that respondents who travel with other passengers implicitly 
make the calculation for the cost of the entire reference trip taking into account the size of 
the traveling party. 

• Such an approach is consistent with their observed behavior for the same reference trip that 
was the basis of the stated-preference experiment. 

• In the instructions to the choice experiments, respondents were repeatedly reminded to
consider their reference trip as the basis for their choices: “Remember we are interested in the
specific trip we have been discussing (the one you took from your home to ___
(destination city)”. Finally, the level of customization was very high with hundreds of
different versions of the stated-preference questionnaires reflecting about 30 different
origin-destination segments and 16 different blocks for each segment.

In summary, respondents were presented with, and properly accounted for, the impact of party 
size on travel costs. During model estimation, we in turn uncovered important differences in 
travel behavior between those traveling alone and those traveling in groups. These differences 
were important enough to warrant the extra step of segmenting the long-distance travel market 
by party size to properly reflect differences in tradeoffs and travel behavior. 


17 Corey, Canapary & Galanis Research (December 2005), High-Speed Rail Study Survey Documentation,
p. 62.
It would also have been better to include the underlying gasoline price in the presentation of
the auto alternatives since most people think about fuel costs in terms of gasoline price and 
they may not have a good idea of the mileage involved in an infrequent auto trip. Was this 
considered? 

We do not believe that including the underlying gasoline price in the choice exercises would 
have improved the realism of the choice experiment, and therefore the quality of the responses 
obtained. On the contrary, presenting the price of fuel instead of the cost of the trip by auto 
could have introduced biases in the responses to the survey. The reasons for our approach are 
as follows: 

• The experiment presented all competing modes on an equal footing using comparable total 
one-way costs and level of service characteristics for each mode; 

• In transportation-related choice experiments, it is customary practice to present the costs for 
each travel option using an equivalent monetary basis (i.e., the total cost to travel from 
point A to point B); 

• It is possible that using the gasoline price may have increased respondent burden and could 
have confused some respondents; 

• Presenting fuel price alone would have required respondents to calculate the total auto
travel cost based on mileage and fuel economy; and

• Instead of asking respondents to perform a comparison of a generalized auto fuel cost to 
specific air and rail fares, the survey included the calculation and presented only the total 
fuel cost to simplify the choice task. 
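
The calculation that respondents would otherwise have had to perform themselves can be sketched
as follows. This is only an illustration of the arithmetic; the function name, trip distance, fuel
economy, and fuel price are hypothetical and are not values used in the survey.

```python
def auto_fuel_cost(distance_miles, miles_per_gallon, price_per_gallon):
    """Return the total one-way fuel cost (in dollars) for an auto trip."""
    if miles_per_gallon <= 0:
        raise ValueError("miles_per_gallon must be positive")
    gallons_used = distance_miles / miles_per_gallon
    return gallons_used * price_per_gallon


# Hypothetical trip: 380 miles at 25 miles per gallon and $3.00 per gallon.
cost = auto_fuel_cost(distance_miles=380.0, miles_per_gallon=25.0, price_per_gallon=3.00)
```

Presenting the resulting total alongside the air and rail fares places all modes on the same
monetary basis for the reference trip.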

In summary, we believe that presenting only the per-gallon fuel price would have increased the
burden on respondents, introduced the risk of miscalculations of relative costs across modes, and
potentially led to bias in the survey responses.
Part 2: Model Structure


1. Sequence of estimation of component models in model system: 

On the first instance of page 3-3 (there are two instances of each of pages 3-1, 3-2, and 3-3), it is
stated that there was an initial model development step which started at the top of the
hierarchy and worked downward; this necessitated the use of a “substitute accessibility or
impedance measure”. This top-down step was followed by a second estimation step, which was in
the usual bottom-up direction. The substitute measures were replaced by the actual measures
in that second step. It is not clear why a top-down step was used initially, as the coefficients
estimated in that step are not the final ones, and do not serve any purpose that we can discern.
The bottom-up estimation step does not use these initial estimates in any manner, nor does it
need to. Can the modelers explain why this unusual first-step approach was used?

The initial step in the estimation process we followed for this project is not an unusual first step 
when confronted with the lack of accurate skims/logsums at project start-up. 

The use of simplified impedance measures for a “first cut” of the logsum variable is a common 
practice when confronted with the practical realities of travel modeling. Few agencies publish 
the details of their model estimation procedures; however, the use of substitute impedance 
measures to simplify model applications is documented for other areas 18 . 

In addition, due to schedule considerations, the initial versions of the various models (mode 
choice, destination choice, frequency, etc.) were developed in parallel before the accessibility 
measures from the lower-level models were available. 

This provided a practical way to reduce overall model estimation time. It also
provided a useful proof-of-concept step since it was possible to get initial estimates of the model
coefficients using the substitute accessibility and impedance measures.

The final specification and estimation of the models were done sequentially using the
bottom-up approach. During model estimation, the proper logsum measures of composite utility were
calculated from the lower-level models and used as inputs to the higher-level models.
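
As a minimal sketch of how a logsum carries information up a model hierarchy, the composite
utility of a lower-level choice is the natural log of the sum of the exponentiated utilities of its
alternatives, and it enters the higher-level utility multiplied by a logsum coefficient. The
function names and the way the higher-level utility is assembled are illustrative assumptions,
not the specification used in the study.

```python
import math


def logsum(utilities):
    """Composite utility ln(sum(exp(V))) of a set of lower-level alternatives.

    The maximum utility is factored out first to avoid numerical overflow.
    """
    largest = max(utilities)
    return largest + math.log(sum(math.exp(v - largest) for v in utilities))


def upper_level_utility(other_terms, logsum_coefficient, lower_level_utilities):
    """Higher-level utility: its own terms plus the scaled lower-level logsum."""
    return other_terms + logsum_coefficient * logsum(lower_level_utilities)


# Hypothetical mode utilities for one origin-destination pair (auto, air, rail).
mode_utilities = [-1.0, -2.5, -3.0]
destination_utility = upper_level_utility(0.4, 0.5, mode_utilities)
```

Because the logsum depends on the lower-level coefficients, it can only be computed once those
models have been estimated, which is why the final estimation proceeds from the bottom up.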


18 Activity-Based Travel Model Specifications: Coordinated Travel - Regional Activity Based Modeling 
Platform (CT-RAMP) for the Atlanta Region, Atlanta Regional Commission, March 2009, page 31. 

Bradley, M., and J. L. Bowman, Design Features of Activity-Based Microsimulation Models for U.S.
Metropolitan Planning Organizations - A Summary, in Innovations in Travel Demand Modeling, Summary 
of a Conference, Volume 2 - Papers, Transportation Research Board, Washington, D.C., 2008, page 19. 
There is no airport/station choice model. Apparently all trips are assumed to be made from the
closest station/airport based on the local networks. In some cases, moreover, the closest
station/airport may vary according to access mode (e.g., SFO and OAK). It is not clear how this is
handled.

An airport/station choice model provides a level of detail that was not critical for
meeting the objectives of the HSR Ridership and Revenue Study sponsored by MTC. 

A station choice model represents a lower-level decision than the main mode choice decision.
Therefore, it was unlikely to affect the ridership forecasts for the overall system or for
different alignments developed using the HSR ridership and revenue model.

The approach used relies on building transit and auto paths between traffic analysis zones to
determine the most likely access and egress station. A similar process was used to determine 
the closest airport. These paths take into account the level of service of the access and egress 
modes and the frequency of service at each station and airport. 

2. Accessibility measures for intraregional trips: 

It is stated, on the second instance of page 3-1, that intraregional models maintained by the
MPOs do not include destination choice models, and therefore do not produce logsum
measures. This necessitated the use of the “substitute accessibility measures” (mentioned above) for
the intraregional trips in the final estimation of the trip frequency models. Four such measures
were used (one for each market segment: auto-peak, auto-off-peak, non-auto-peak and
non-auto-off-peak). The functional forms for these measures are presented without any
justification. What is the theoretical or empirical basis for these functional forms?

The empirical basis of these functional forms is that they offer simplified versions of the full- 
fledged destination choice logsums. Complete logsums cannot be estimated in this case because 
destination choice models are not part of the intraregional MPO models. 

The simplified formulation has been used in various urban activity-based model systems 19 . 
These accessibility measures are relatively easy to calculate while still providing good
approximations to more “complete” logsums that would have been used if the intraregional MPO
models included a destination choice model formulation.


19 Ibid. 
Part 3: Model specification and estimation


1. Market Segmentation 

Each model is segmented on short and long trips and these are estimated separately. This 
causes problems on the long/short trip boundary (100 miles) as can easily be seen in Figure 3.4 
in the Model System Development Final report dated August, 2006. The “net utility”
(presumably the index function in the choice models?) for short business trips declines from 0 to -4 at 100
miles, and then jumps back up to -1.7 before declining again and finally rising starting at 250 
miles. Similar behavior occurs for the other trip types. This discontinuity may cause strange 
forecasting behavior for trips close to 100 miles. Was this issue addressed? 

We disagree that the segmentation of trips into long trips and short trips “causes problems” in
model application. 

We believe that the information in Figure 3.4 is being misinterpreted, and that the reviewer did 
not carefully read the text on Page 3-17. The y-axis in Figure 3.4 is labeled “net effect”, not “net 
utility”. The text on Page 3-17 points out that the “net effect” shown in Figure 3.4 is “the
collective impact of all three distance coefficients on one’s destination choice”.

The word “net” is used because these are the residual distance effects of the mode choice 
logsum components accounting for the travel time and cost sensitivity in the models. After 
accounting for all level of service-related contributions to the mode choice logsum, it is not 
obvious that the residual effect of distance should have any particular shape. Essentially, “net 
utility” is not the correct item to review. To get a truly informative plot by distance, one would 
need to calculate the mode choice logsums between all TAZ pairs, add the residual distance 
effects, and then average and plot the combined information by distance or distance band. 
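
The procedure outlined above can be sketched as follows, assuming the mode choice logsum for each
TAZ pair is already available. The function names, the form of the residual distance function,
the logsum coefficient, and the sample values are hypothetical placeholders rather than the
estimated model.

```python
from collections import defaultdict


def combined_effect_by_band(pairs, distance_effect, logsum_coefficient, band_width=50.0):
    """Average (scaled logsum + residual distance effect) within distance bands.

    pairs: iterable of (distance_miles, mode_choice_logsum), one entry per TAZ pair.
    distance_effect: function returning the residual distance utility for a distance.
    Returns a dict keyed by the lower edge of each distance band.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for distance, mode_logsum in pairs:
        band = int(distance // band_width) * band_width
        totals[band] += logsum_coefficient * mode_logsum + distance_effect(distance)
        counts[band] += 1
    return {band: totals[band] / counts[band] for band in sorted(totals)}


def hypothetical_distance_effect(distance):
    """Placeholder residual distance term; not the estimated coefficients."""
    return -0.01 * distance


# Hypothetical TAZ pairs: (distance in miles, mode choice logsum).
taz_pairs = [(60.0, -2.0), (80.0, -3.0), (120.0, -4.0)]
profile = combined_effect_by_band(taz_pairs, hypothetical_distance_effect, 0.5)
```

Plotting the resulting values against the distance bands would show the combined separation
effect, rather than the residual distance terms alone.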

The text states that “caution should be used in interpretation because a great deal of the
impedance between origin-destination pairs is captured within the travel impedance term [from the
mode choice model] and coefficient”, not simply the distance terms shown in Figure 3.4
(emphasis added).

The primary reason that the distance effects were included in the destination choice models is 
the typical finding in practical, operating (urban) models that mode choice logsums by
themselves are not adequate to represent all of the separation effects between destinations. Trying to
explain all of the separation impacts solely by the logsums leads to unreasonably high logsum
coefficients. The formulation used for the interregional destination choice models parallels the
formulation found in the large majority of advanced destination choice models used in
metropolitan areas. 20, 21


20 San Francisco Travel Demand Forecasting Model Development, Destination Choice Models - Final Report, 
prepared for San Francisco County Transportation Authority, prepared by Cambridge Systematics, 
Inc., October 1, 2002. 

21 Appendix E - Travel Demand Model Technical Memorandum, prepared for Memphis Metropolitan 
Planning Organization, prepared by Kimley-Horn and Associates, Inc., March 2008. 
On Page 3-2 of the “model system development” report, the market segments are defined. Does
this imply that any off-peak business or commuting trips were ruled out? 

No, this is not the case. Time of day of travel was not used to remove observations from the 
model estimation dataset. 

The main distinction in model estimation was that all business and commute trips used peak 
period level of service information, and all other trips used off-peak level of service information. 

This is consistent with MPO practice to use peak period levels of service for home-based work 
trip distribution and off-peak skims for home-based non-work and non-home-based trip
distribution. This modeling approach is used extensively in urban models when there is not an
explicit time-of-day choice model. In a recent survey of 15 large MPOs, 11 agencies had used
peak skims for home-based work trip distribution, including MTC, SCAG, MWCOG, DRCOG,
and Las Vegas RTC 22 .

2. Trip frequency models 

The report does not indicate which data sets were used for the development of these models: 
the SP data, the Caltrans RP data, the MPOs’ RP data or some combination of the above? 
(Note that this issue is partly clarified later, in the description of the destination choice
models; see next bullet).

The report correctly states that the dataset used for the trip frequency models is “comprised of 
interregional trips from the California Statewide survey, the SCAG survey, the SACOG survey, 
and the MTC/BATS survey” 23 . 


22 RTC 2004 Regional Travel Demand Model, prepared for Regional Transportation Commission of 
Southern Nevada, prepared by Parsons Corporation, September 2004, page 7-5. 

Integrated Regional Model - Model Refresh Project, Chapter 5: Mode Choice Model Documentation, 
prepared by Denver Regional Council of Governments, December 2004, pages 4-5. 

Charles L. Purvis, Travel Demand Models for the San Francisco Bay Area (BAYCAST-90) - Technical 
Summary, Metropolitan Transportation Commission, Oakland, California, June 1997, page 16. 

23 Cambridge Systematics, Inc. (August 2006), Bay Area/California High-Speed Rail Ridership and Revenue 
Forecasting Study: Interregional Model System Development, Final Report, p. 3-15. 
There is some inconsistency between the statements made at the bottom of page 3-15 and that
made earlier on (the first instance of) page 3-2 regarding the datasets used to develop the trip
frequency models. On page 3-2, it was stated that the estimation dataset for these models was
the statewide survey diary-days. On page 3-15, we learn that the dataset used for the trip
frequency models consisted of the California Statewide survey, the SCAG survey, the SACOG
survey, and the MTC survey, combined with the stated-preference survey. It should be noted that the
sample sizes shown in tables 3-3 and 3-4 seem to indicate that the second description is the
correct one. Is this the case?

As noted in the prior response, the trip frequency models were estimated from a combined
dataset comprised of the three MPOs’ RP data and the Caltrans RP data. This dataset
description was mentioned on page 3-15:

“The dataset used for the trip frequency models (comprised of interregional trips 
from the California Statewide survey, the SCAG survey, the SACOG survey and 
the MTC/BATS survey)...” 

The trip frequency dataset did not include any information from the stated-preference survey as
implied in the question above.

The use of an MNL model for trip frequency is statistically incorrect, because the number of trips
(0, 1, 2 or more) is an ordinal (rather than categorical) dependent variable. MNL models are
appropriate to use only when the dependent variable is categorical (e.g., modes, destinations,
etc.). For ordinal variables, the correct modeling approach is an Ordered Logit or Ordered
Probit model. Is there a reason why ordered models were not used? Were such models
attempted?

We do not agree that the trip frequency models were estimated in a manner that was
“statistically incorrect”. The method used was one of several model formulations that were considered
for the trip frequency model, including ordered and Poisson models.

The analysis of the long distance travel survey data revealed a concentration of observations in 
the “no travel” and “one long distance trip” categories. There were simply insufficient choices 
of “two or more trips” to warrant or support the estimation of an ordered or Poisson model 
formulation for an ordinal dependent variable. 

As a result, the model was estimated as a binary MNL choice model comparing two
“categories” of long distance trips: “don’t make a long distance trip” versus “make a long distance
trip”. The “two or more trips” subcategory basically represents those travelers who elected to
not stay overnight at their destination.

In the application of the model, the “two or more trips” choice was specified using the same
model coefficients as the “make a trip” choice and a constant was calibrated to produce the
correct number of “two or more trips” trip makers (i.e., those who elected to not stay overnight at
their destination).
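
One way such a constant could be calibrated is sketched below. The sketch assumes three
alternatives (no trip, one trip, two or more trips), with the “two or more trips” utility equal to
the “make a trip” utility plus a constant, and it uses the common iterative rule of adding the log
of the ratio of the target share to the predicted share. The function names, household utilities,
and target share are hypothetical; the actual calibration procedure may differ.

```python
import math


def average_shares(trip_utilities, two_plus_constant):
    """Average choice probabilities (none, one trip, two or more) over households."""
    totals = [0.0, 0.0, 0.0]
    for v in trip_utilities:
        # Utility of "no trip" is normalized to zero.
        exps = [1.0, math.exp(v), math.exp(v + two_plus_constant)]
        denominator = sum(exps)
        for i, value in enumerate(exps):
            totals[i] += value / denominator
    return [t / len(trip_utilities) for t in totals]


def calibrate_two_plus_constant(trip_utilities, target_share, iterations=50):
    """Adjust the constant until the predicted 'two or more' share matches the target."""
    constant = 0.0
    for _ in range(iterations):
        predicted = average_shares(trip_utilities, constant)[2]
        constant += math.log(target_share / predicted)
    return constant


# Hypothetical "make a trip" utilities for three households and a hypothetical target share.
household_utilities = [-3.0, -2.5, -2.0]
calibrated = calibrate_two_plus_constant(household_utilities, target_share=0.01)
final_shares = average_shares(household_utilities, calibrated)
```

The coefficients on the explanatory variables are shared between the two trip-making
alternatives; only the constant differs between them.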
The “Regional dummy variables”, described earlier on the same page, are set on the basis of the
households’ residence. They do not refer to the trip’s destination (which is obvious since the
dependent variable in the present model is trip frequency, not trip destination!). Therefore, the
statement in the bullet before last on page 3-7 (which attributes the high MTC regional
coefficient to the Bay Area’s tourist and other attractions) is wrong. The correct interpretation of
that coefficient is that, all else equal, Bay Area residents make more trips outside of their
region relative to residents of other regions. This result is the opposite of what the modelers
expected (as stated at the top of the same page). Can this unexpected result be explained?

We agree with the reviewers’ point. The text is not consistent with the estimation results 
reported in the summary table. Also, little emphasis should be placed on the interpretation of 
regional constants which simply reflect what is left unexplained by the model. 

For clarification, we want to point out that the regional dummy variables shown in the report 
are based on the model estimation and are not the final values. They were later adjusted as part 
of the calibration/validation process to reproduce observed overall levels of interregional trip 
making from the four areas. 

The final model coefficients and constants were included in the text file “coeffs.txt” that has 
been available from MTC since 2007. These final values were also summarized in tables 
included with a memorandum from George Mazur to Nick Brand, dated January 29, 2010. 

The logsum measures were constrained in the final model estimation, because (as stated at the 
bottom of page 3-7) they came out relatively small in the estimation. What was the basis for 
the value that the modelers selected? 

Logsum coefficients were constrained to the commute coefficient. The basis for use of this value 
is that the commute purpose had the smallest coefficient of all purposes and resulted in more 
conservative and stable estimates of induced travel. 

The report contains two totally different sets of results for the Long Trips Frequency Models: 
Tables 3.2 and 3.3 (the two tables have the same title). To add to the confusion, the report 
states, on page 3-10, that Table 3.3 “presents the estimation results of the trip frequency models 
for short trips” (emphasis added). One additional source of confusion is that the discussion on 
page 3-10 is not consistent with either the results presented in Table 3.3 or Table 3.4! We would 
welcome a clarification of the proper table numbers and titles; without that, it is not possible 
to make sense of the text. 

We agree that the text requires editing to clarify these differences. 

Table 3.3 is the model estimation summary table for long trips (instead of Table 3.2). Table 3.4 is 
the model estimation summary table for short trips. The discussion on Page 3-10 needs to be 
revised to reflect the model coefficients and constants included in the text file “coeffs.txt”. These 
final values were also summarized in tables included with a memorandum from George Mazur 
to Nick Brand, dated January 29, 2010. 

3. Destination Choice Models 


The combination of the Revealed-preference (RP) data (from the Statewide survey and regional 
MPO surveys) with the Stated-preference (SP) data was done without accounting for the differences 
between these two types of data; this is confirmed by examining tables 3.9 and 3.10. 
What was the reason for ignoring the differences between the two data sources? 

We did not combine the RP and SP datasets to estimate destination choice models. 

For purposes of destination choice model estimation, only the revealed-preference data that 
included the actual locations of the origins and destinations of the trips were used for model 
estimation. 

Although the O-D trip information formed the basis for the stated-preference experiments, none 
of the stated-preference information was used in destination choice model estimation. Since the 
SP information was not used, there are no issues related to “differences between the two data 
sources” as stated in the comment. 

It is true, as stated in the middle of the page, that one cannot interpret the individual distance 
coefficients individually, but rather should consider their collective impacts. Accordingly, the 
collective impact of the distance coefficients in the destination choice models for long trips 
(Business and Recreation), shown in Figure 3.4, leads to incorrect predictions. These figures 
imply that, everything else being equal, destinations that are more than approximately 
250 miles apart become increasingly attractive with distance! Is there an explanation for this 
counter-intuitive result? 

We believe that the reviewers have misinterpreted Figure 3.4 and drawn incorrect conclusions. 

The y-axis in Figure 3.4 is labeled “net effect”, not “net utility”. The text on Page 3-17 points out 
that the “net effect” in Figure 3.4 is “the collective impact of all three distance coefficients on 
one’s destination choice”. The text goes on to state that “caution should be used in interpretation 
because a great deal of the impedance between origin-destination pairs is captured within the travel 
impedance term [from the mode choice model] and coefficient”, not simply the distance terms shown 
in Figure 3.4 (emphasis added). 

In summary, the distance effects are residual, and should not be interpreted apart from the 
mode choice logsum effects. 

4. Access/Egress Mode Choice Models 


The data set used for model estimation is not specified precisely: it appears that the SP data 
and some RP data were used, but it is not clear which subset of the latter were included. Can a 
clarification be provided? 

All of the data used for estimation of the access/egress mode choice models came from the new 
surveys collected for this project and included both stated-preference and revealed-preference 
survey responses. As detailed in the Interregional Model System Development Report: 

“The access and egress mode choice models were based on actual reported and hypothetical 
stated data. For people who were intercepted making actual air or rail journeys, 
the access and egress mode choices are the actual reported ones. For people whose 
actual journey was by car, the air and conventional rail access/egress mode choices are 
hypothetical. Obviously, the HSR access and egress mode choices are hypothetical for 
all respondents. So, each respondent provided up to 3 access choices and 3 egress 
choices, although most respondents only provided 2 of each, because conventional rail 
and air were only included together in the mode choice set for the LA-SD surveys.” 24 

Both Access and Egress models are Nested Logit (NL) models. The estimation results of the 
two models are shown in Tables 3.12 and 3.13. There are five market segment models for Access 
(and five for Egress): Long Trip Business, Long Trip Other, Short Trip Business, Short Trip 
Commute and Short Trip Other. By comparing the estimation results in the two tables, one can 
see that the Nest’s logsum coefficients are exactly the same across the two sets of models (i.e., 
for each market segment, the Nest logsum coefficient is the same for access and egress). With 
the exception of one model (Short-Trip-Other) where the logsum for the Nest is constrained, 
these coefficients are all estimated statistically! Can the modelers explain how the Nest 
Logsum coefficient estimates came out to be exactly the same across four pairs of models? 

The reviewer has correctly identified that the values reported in the Interregional Model System 
Development Report are from model estimation, and are not the final coefficients and constants. 
The nest logsum coefficients from model estimation were not correctly reported in Tables 3.12 
and 3.13 of this report. However, the final model coefficients and constants were included in 
the text file “coeffs.txt” that has been available from MTC since 2007. 

These final values were also summarized in tables included with a memorandum from George 
Mazur to Nick Brand, dated January 29, 2010. Based on an April 13, 2010 e-mail from George 
Mazur to Samer Madanat (Subject: Re: List of Reports), it was our understanding that the 
review team had the final model coefficients and constants: 

“From your e-mail, I surmise that your team has CS’ January 29, 2010 memorandum 
and the accompanying tables that show the final calibrated coefficients and 
constants. I also draw your attention to my March 3, 2010 memorandum to 
Medhi Morshed that explains the typographical error in the January 29 tables. 

The March 3 memorandum provides additional information about the model 


24 Ibid. p. 3-28. 
calibration and validation process, including the Peer Review’s role in informing 
that process. I have attached the March 3 memorandum to this e-mail.” 

In summary, the nest logsum coefficients are not the same across pairs of models. In the access 
mode choice models, the four logsum coefficients range in value from 0.37 to 0.57, with one 
value constrained to 1.0. In the egress mode choice models, the five logsum coefficients range 
in value from 0.28 to 0.76. 

The same observation made above with respect to the Nest logsum coefficients applies to the 
Scale coefficients (SP/RP): the Scale coefficients for four of the models are constrained (they 
are constrained to be equal to 1.0 in both Access and Egress models without justification). The 
only scale coefficient that is statistically estimated is that for the Long-Trip-Business models, 
but here, the coefficient in the Egress model is exactly the same as that in the Access model. The 
question asked in the previous bullet applies here as well. 

As noted earlier, the final model coefficients and constants were included in the text file 
“coeffs.txt” that has been available from MTC since 2007. These final values were also summarized 
in tables included with a memorandum from George Mazur to Nick Brand, dated 
January 29, 2010. 

For the access mode choice model, most scale parameters were constrained to 1.0. As documented 
in the Interregional Model System Development Report: 

“For most of the Access mode segments, the scale (the inverse of the residual error 
variance) for the hypothetical choices was not significantly lower than 1.0. It was only 
so for the Business Long segment, which is mainly air travelers. This result suggest that 
most people are fairly familiar with the travel options near their home, but that business 
travelers may be more familiar with the airport access situation than with possible 
access to rail stations. 

In contrast, for most of the Egress model segments, the scale factor on hypothetical 
choices is significantly less than 1.0. This result indicates that many respondents have 
difficulty making accurate tradeoffs for mode choice in less familiar surroundings at the 
non-home end of their trip, so that hypothetical choices should be weighted less in estimation 
than actual ones.” 25 

The scaling parameter was constrained to 1.0 in cases where it was clear that it was not significantly 
different from 1.0 and that, by constraining the scaling parameter, logsum parameters 
could be estimated. 
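
To illustrate what a scale below 1.0 on hypothetical choices means in practice, the following is a minimal sketch, assuming a simple logit in which the stated-preference utilities are multiplied by a scale factor before the choice probabilities are computed. The function name and the utility values are hypothetical and are not taken from the CS models.

```python
import math

def logit_probabilities(utilities, scale=1.0):
    """Logit choice probabilities with utilities multiplied by a scale factor.

    A scale below 1.0 corresponds to noisier (higher error variance)
    responses, which flattens the probabilities toward equal shares.
    """
    scaled = [scale * v for v in utilities]
    top = max(scaled)  # subtract the maximum for numerical stability
    exp_v = [math.exp(v - top) for v in scaled]
    total = sum(exp_v)
    return [e / total for e in exp_v]

# Hypothetical utilities for two access modes (illustrative only).
utilities = [1.0, 0.0]
p_full = logit_probabilities(utilities, scale=1.0)
p_half = logit_probabilities(utilities, scale=0.5)
print(p_full, p_half)
```

With the same utilities, the lower scale pulls the preferred alternative's probability closer to one half, which is the sense in which hypothetical choices carry less weight than actual ones.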


25 Ibid. p. 3-34 

The text description of the model estimation results (on page 3-34) is inconsistent with the 
results presented in Tables 3.12 and 3.13. The text indicates that the Scale coefficients were not 
constrained but rather statistically estimated (and that the Scale coefficient estimates were 
significantly lower than 1.0 in the Egress models). Either the text or the Tables are correct. 
Which one is it? 

The text is correct and matches summary information in Tables 3.12 and 3.13 included with the 
January 29, 2010 memorandum from George Mazur to Nick Brand. 

5. Main Mode Choice Models 

The data set used for estimation of the main mode choice model consisted only of the SP survey 
data, despite the availability of a large RP data set (the interregional trips in the statewide 
Caltrans survey and those in the MPO surveys)! Why was this decision made? 

The stated-preference dataset was used to assess the potential diversion of existing drivers and 
air travelers to high-speed rail. Although intercity rail is a currently available travel option for 
some origin-destination pairs and is present in the revealed-preference dataset, its characteristics 
are quite different than high-speed rail, which is only available in the stated-preference survey. 

The revealed data in the Caltrans and MPO surveys were so limited in terms of modes other 
than auto that they would have added only to the automobile sample. Information on travel by 
automobile in the stated-preference survey was sufficient for model estimation purposes. So, 
although additional data on auto use were available in the revealed-preference dataset, there 
were few additional records for the non-auto modes and, of course, no observations of high-speed 
rail usage. 

These issues, and the possibility of including revealed-preference data in the mode choice 
model estimation dataset, were discussed with MTC, HSRA, and the peer review panel. There 
was unanimous agreement that such an approach was not a priority and should not be pursued. 26 

Another concern with the SP data used for estimation is that each respondent provided responses 
to several hypothetical choice situations. It is expected that the variance of the error 
term increases with time (this trend has been confirmed in the literature, Reference). To 
account for this, a number of scale coefficients should be specified and estimated in the model. 
It appears that no such scaling was performed; what is the reason for this omission? 

We disagree with the statement that “scale coefficients should be specified and estimated in the 
model.” We appreciate the concern noted here and addressed it by including only four stated- 
preference choice exercises for each respondent. This approach was adopted to keep the survey 
burden manageable and avoid the type of fatigue effect referred to by the reviewers. 

Mark Bradley, a subconsultant for the model development project, published some of the original 
research on that effect. 27 The fatigue effect is not a typical consideration in applied 


26 Combining revealed- and stated-preference data to calibrate model for HSR was also outlined in 
Cambridge Systematics, Inc. (May 2005), Bay Area/California High-Speed Rail Ridership and Revenue 
Forecasting Study: Work Plan, Final Plan, p. 1-14. 
transportation stated-preference studies since these surveys are generally short. Also, no 
adjustment was made to the standard errors for the repeated measurements per respondent. 
Again, using only four choice experiments per respondent suggests that such an adjustment 
would not be necessary and would not affect the estimation results obtained with the unadjusted 
standard error. 

As with the other models in the report, the number of constrained coefficients is large. We 
understand the need to set the alternative specific constants to match baseline validation mode 
shares, but do not understand the justification for constraining key parameters like reliability, 
service headway, in-vehicle time, cost, etc. Can the modelers explain why so many coefficients 
were unconstrained? 

It appears that the question had a typographical error, and that you meant to ask “can the modelers 
explain why so many coefficients were constrained”. Our response reflects this interpretation of 
the question. 

The debate whether to constrain coefficients to values that are different than those obtained 
during model estimation arises often in practical applications where estimated models are 
examined and then calibrated before they are applied to forecast ridership. Constraining coefficients 
is undertaken only to improve the robustness of the model by enhancing its internal consistency 
and the ability to replicate existing travel patterns. 

There are many examples where models use constrained coefficients to reflect the experience 
from other similar projects and study areas. In the Federal New Starts program, the Federal 
Transit Administration (FTA) places priority on robust forecasts instead of detailed model estimation. 
To support this approach, the FTA recommends a range of constrained values for key 
service and cost coefficients to guide the model application using a robust modeling tool. 28 

In this project, constraining coefficients required the model development team to apply its professional 
judgment to calibrate the model and develop a robust tool for application. We 
weighted statistical evidence from the data against the sensitivity of the model, literature that 
reflects previous evidence, and the ability of the model to predict observed travel patterns. 

The application of the unconstrained models obtained directly from model estimation did not 
satisfactorily replicate the observed conditions in the base year. To match the control totals by 
focusing only on the modal constants would have required major changes in the values of the 
modal constants that would have a major adverse impact on the policy sensitivity of the model. 


27 Bradley, M., and A. Daly (1994), “Use of the logit scaling approach to test for rank order and fatigue 
effects in stated-preference data,” Transportation, Volume 21, pp. 167-184. 

28 Travel Forecasting for New Starts, A Workshop Sponsored by the Federal Transit Administration, 
March 23-25, 2009, Tampa, Florida, FTA New Starts guidance. 

The rationale for evaluating the estimated models and constraining the individual coefficients 
in the mode choice model can be summarized as follows: 

• Reliability was constrained because it was difficult to measure in the survey and the results 
did not meet expectations from other studies. The reliability measure was discussed by the 
peer review panel, and it was suggested that we consider revising the measure to be consistent 
across modes (within 60 minutes of scheduled time) during model calibration, and 
that we consider constraining this variable to account for the change in definition. 29 

• Service headway (frequency) was constrained during model calibration to address an overestimation 
(compared to observed base year data) of air trips in markets with low frequency 
air service and an underestimation of air trips in markets with high frequency air service. 
Service headway coefficients were set to match in-vehicle time coefficients based on professional 
judgment of the model development team. This constraining was deemed to be a 
more reasonable approach than use of higher mode-specific constants that would have a 
greater impact on the sensitivity of the model. The merits of different potential interpretations 
and values for the headway coefficient were documented in draft and final versions of 
the model development report (page 3-36 in the final version). The value of the constrained 
headway coefficient was within the range of reasonable values presented to peer review. 

• The constraining of the cost and in-vehicle travel time values for short business and commute 
segments is described in detail in the model development report (page 3-36): “For the 
three largest segments, the cost and in-vehicle time parameters were estimated nonconstrained 
and give very reasonable values of time. For the Short Business and Commute 
segments, the original in-vehicle time coefficients were quite low, and so were constrained 
to give values of time that seem more in line with other models. In general, VOT for the 
longer, more expensive trips is higher than for the shorter, more frequent trips. This is a 
typical result.” 

In summary, the need to constrain individual coefficients reflected the objective of the development 
team to better replicate existing travel patterns, maintain the policy sensitivity of the 
models, and enhance the robustness of model application. 
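
The link between the in-vehicle time coefficient, the cost coefficient, and the implied value of time (VOT) can be made concrete with a short sketch. It assumes time is measured in minutes and cost in dollars, so VOT in dollars per hour is 60 times the ratio of the two coefficients; the units, function names, and coefficient values are illustrative assumptions, not values from the model development report.

```python
def value_of_time(beta_ivt, beta_cost, minutes_per_hour=60.0):
    """Implied value of time in dollars per hour.

    Assumes beta_ivt is utility per minute and beta_cost is utility per dollar.
    """
    return minutes_per_hour * beta_ivt / beta_cost

def constrained_ivt(target_vot, beta_cost, minutes_per_hour=60.0):
    """In-vehicle time coefficient that yields a chosen value of time."""
    return target_vot * beta_cost / minutes_per_hour

# Hypothetical coefficients (illustrative only).
vot = value_of_time(beta_ivt=-0.02, beta_cost=-0.06)
beta = constrained_ivt(target_vot=20.0, beta_cost=-0.06)
print(vot, beta)
```

Constraining an in-vehicle time coefficient "to give values of time that seem more in line with other models" amounts to choosing the target VOT first and backing out the coefficient, as the second function does.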


29 Cambridge Systematics, Inc. (July 2006), Bay Area/California High-Speed Rail 
Forecasting Study: Findings from Second Peer Review Panel Meeting, Draft Report, p. 

Memorandum 

TO: Samer Madanat 
FROM: George Mazur 
DATE: June 24, 2010 
RE: Response to Questions 


This memorandum provides responses to the three clarification questions included in your June 
11, 2010 correspondence. Please contact us if you have any further questions. 


Question #1: Data Sets for Trip Frequency and Destination Choice Models 

New Question 

This response contradicts the text at the bottom of page 3-15 and the top of page 3-16 of the final 
report, which is shown above under "Our observation". It also contradicts the information 
given on (the first instance of) page 3-2 of the final report, third bullet: 

"Destination choice models-...The model input data are a mix of trips from the 
statewide survey and the SP survey. ..." 

Either the Final report or the CS response is correct; we would welcome a clarification. 

Response 

There appears to be confusion stemming from different usages of the term "SP survey". "SP 
survey" and "SP dataset" were used interchangeably in the Interregional Model System 
Development (IMSD) Final Report to refer to both the overall survey effort conducted in 2005 as 
part of the model development project as well as specific subsets of data from that survey. We 
understand that this usage is causing confusion. 

To clarify, we suggest using the phrase "HSRA 2005 Statewide Surveys" instead of "SP Survey" 
when referring to the overall survey effort. The "HSRA 2005 Statewide Surveys" includes both 
a revealed preference (RP) dataset and a stated preference (SP) dataset. The RP dataset includes 
information about the survey respondent and the specific observed trip that served as the basis 
for constructing the respondent's customized choice exercises. The SP dataset includes the 
values presented to the respondent in each choice exercise as well as the actual choice made by 
the respondent in each exercise. The table below uses this naming convention to illustrate the 
survey data used for estimating each model component. Regarding your prior question about 
combining "Revealed Preference (RP) data (from the Statewide survey and regional MPO 
surveys) with the Stated Preference (SP) data" for the destination choice model, the table below 
shows that such a combination did not occur. Only RP data were used in estimating the 
destination choice model. 

Datasets Used for Model Estimation 

                                                                HSRA 2005 Statewide Surveys
                             Caltrans 2000      MPO Household   Revealed Preference   Stated Preference
Travel Model Element         Statewide Survey   Surveys*        (RP) Dataset          (SP) Dataset
Trip Frequency Models        Yes                Yes             -                     -
Destination Choice Models    Yes                Yes             Yes                   -
Access/Egress Models         -                  -               Yes                   Yes
Main Mode Choice Models      -                  -               -                     Yes

* Includes interregional travel records from MTC, SACOG and SCAG household travel surveys. 


Question #2: Access and Egress Models 

New Question 

According to the memorandum George Mazur to Nick Brand, dated January 29, 2010, the 
model structure was finalized in April 2007. By this, we understand that the coefficient 
estimates included in the text file "coeffs.txt" date from approximately April 2007. 

The Interregional Model System Development report has a date of August 2006 (eight months 
prior to finalization of the model), yet the text from page 3-34 refers to the values of the final 
model coefficients (as made clear in the CS response). Does the report refer to results that were 
obtained several months after it was prepared? 

We believe a clarification of this apparent contradiction is needed. 

Response 

The Interregional Model System Development (IMSD) Final Report does not refer to results 
obtained several months after its publication. The January 29, 2010 memorandum identifies the 
final model coefficients, while the IMSD Final Report (August 2006) identifies the initial 
estimated coefficients prior to initiation of model calibration and validation. As it turns out, the 
estimated access and egress coefficients did not change in the calibration/validation process, 
and the final coefficients are the same as those presented in Table 3.13 in the IMSD Draft Report 
that was sent to the Peer Review Panel. During preparation of the IMSD Final Report, 
information from Table 3.12 was inadvertently copied into Table 3.13, and overwrote the correct 
table values for the egress mode choice coefficients that appeared in the IMSD Draft Report. 
While the text in the IMSD Final Report is correct, we have identified that the information in 
Table 3.13 of the Final Report is not correct. The values for Table 3.13 in the IMSD Final Report 
should be the same as the ones presented in the IMSD Draft Report and in the January 29, 2010 
memorandum because the egress coefficients did not change. 


Question #3: Statewide Model Validation 

New Question 

The procedure followed for calibrating the mode-specific constants is described in the report 
titled "Statewide Model Validation", prepared by Cambridge Systematics and dated July 2007. 

Page 5-13 contains a description of the procedure used for calibrating the constant for the HSR 
mode. This description is not clear. We would welcome a more detailed, step by step 
description of this procedure. 

Response 

During model validation the constants for the existing auto, air, and rail intercity modes were 
calibrated to reflect existing market shares and levels of service. The procedure for determining 
the constants for high speed rail (HSR) was informed both by the model estimation results and 
by the final calibrated values of the constants for existing modes. 

In addition, HSR constants were carefully assessed to avoid "optimism" bias that is often 
present in forecasts for new modes and facilities. For this detailed response, we first briefly 
discuss this consideration, then present the model validation results for the existing intercity 
modes, and finally discuss where HSR constants are positioned within the range of constants 
for existing intercity modes. 

Optimism Bias 

Past experience with forecasting ridership for new urban and intercity rail projects suggests the 
presence of optimism bias. Large constants for new rail modes often reflected a high degree of 
anticipated adoption by travelers. The application of models without careful consideration of 
the potential for optimism bias can lead to ridership forecasts that may prove too optimistic. 

Examples of the impact of optimism bias are presented by Pickrell who wrote a seminal paper 
in 1992 discussing rail ridership forecasts that substantially exceeded the experienced 
ridership 1 . Flyvbjerg, et al 2 have also examined mega projects where utilization forecasts 
proved too high and construction and operating costs were underestimated. 


1 Don Pickrell (1992), "A Desire Named Streetcar: Fantasy and Fact in Rail Transit Planning," Journal of the American 
Planning Association, Vol. 58, No. 2, Spring, pp. 158-176. 

2 Flyvbjerg, Bent, Mette K. Skamris Holm, and Soren L. Buhl, 2005, "How (In)accurate Are Demand Forecasts in 
Public Works Projects?" Journal of the American Planning Association, 71(2), spring 2005. 

In the case of forecasting for urban rail projects in the US, the use of optimistic rail constants 
made the comparative analysis of project funding applications very difficult. In response, the 
Federal Transit Administration issued guidance on constraining model coefficients within a 
tight range 3 . The FTA also asked applicants to treat the attractiveness of the proposed rail 
service as equivalent to an enhanced bus service. These tight rules were aimed at developing a 
level playing field for all applicants and minimizing the effect of optimism bias 4 . 

The key lesson learned from these studies is that caution needs to be exercised when evaluating 
the potential of a mode that does not currently exist. We have used this approach to carefully 
evaluate the attractiveness of the HSR travel option whose adoption by the public has not been 
tested in California and in most of the US. 

Calibration of Constants 


The mode choice model calibration is an iterative procedure. Since observed target shares for 
auto, air, and rail trips could be estimated from base year data, the calibration of the constants 
for those modes could take place using standard calibration techniques. Specifically, the natural 
log of the ratio of target mode shares to modeled mode shares for the current iteration was used 
as an adjustment factor to the constant for the current iteration: 


$$C_{n+1} = C_n + \ln\left(\frac{\text{Target}}{\text{Modeled}_n}\right)$$


where: 


C is the constant 
n is the iteration 

Target is the target share for the mode 
Modeled is the modeled share for the mode 

The constants were adjusted for each iteration so that the next iteration auto constant was zero. 
This adjustment is performed by simply subtracting the value of the auto constant from each 
constant. This step was performed for the three existing intercity modes. 
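
The iteration described above (add the natural log of target share over modeled share to each constant, then subtract the auto constant so that auto stays at zero) can be sketched as follows. This is a minimal illustration rather than the procedure CS actually ran: the two-market setup, the trip weights, the utilities, the target shares (rounded so that they sum to one), and all function names are hypothetical.

```python
import math

def aggregate_shares(constants, markets):
    """Trip-weighted multinomial logit mode shares over all markets."""
    totals = {mode: 0.0 for mode in constants}
    weight_sum = sum(weight for weight, _ in markets)
    for weight, utilities in markets:
        exp_v = {m: math.exp(constants[m] + utilities[m]) for m in constants}
        denom = sum(exp_v.values())
        for m in constants:
            totals[m] += weight * exp_v[m] / denom
    return {m: t / weight_sum for m, t in totals.items()}

def calibrate_constants(targets, markets, base_mode="auto", tol=1e-10, max_iter=500):
    """Iterate C_{n+1} = C_n + ln(Target / Modeled_n), keeping the base mode at zero."""
    constants = {m: 0.0 for m in targets}
    for _ in range(max_iter):
        modeled = aggregate_shares(constants, markets)
        if max(abs(modeled[m] - targets[m]) for m in targets) < tol:
            break
        # Log-ratio adjustment for every mode, including the base mode.
        constants = {m: constants[m] + math.log(targets[m] / modeled[m]) for m in targets}
        # Normalize: subtract the base-mode constant from each constant.
        shift = constants[base_mode]
        constants = {m: c - shift for m, c in constants.items()}
    return constants

# Hypothetical targets and two hypothetical origin-destination markets,
# each given as (trips, level-of-service utility by mode).
TARGETS = {"auto": 0.88, "air": 0.115, "rail": 0.005}
MARKETS = [
    (600.0, {"auto": 0.0, "air": -1.0, "rail": -2.0}),
    (400.0, {"auto": 0.0, "air": 0.5, "rail": -1.5}),
]
constants = calibrate_constants(TARGETS, MARKETS)
shares = aggregate_shares(constants, MARKETS)
print(constants, shares)
```

With a single market the update would match the targets in one step; it is the aggregation over markets with different levels of service that makes the procedure iterative.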

The HSR constants were initially developed during model estimation. This development 
included initial model estimation that was reported in the Interregional Model System 
Development report, and a final model estimation that was completed in conjunction with 
refinements to model coefficients prior to initiation of calibration. The resulting HSR constants 
were finalized following the validation stage where the constants for existing air and rail service 
were determined. The final HSR constants approximate the relationship to air and rail 
constants that resulted from the final model estimation using the stated preference surveys: 


3 FTA Recommendations for New Starts projects discussed at 

http://www.fta.dot.gov/planning/newstarts/planning_environment_5615.html 

4 It should be noted that the FTA has recently relaxed the rules on the values of rail constants to some extent 
recognizing a small advantage of rail over bus alternatives. 

$$\frac{(C_{HSR} - C_{Rail})_{Final}}{(C_{Air} - C_{Rail})_{Final}} \approx \frac{(C_{HSR} - C_{Rail})_{Estimated}}{(C_{Air} - C_{Rail})_{Estimated}}$$
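
The relationship states that the HSR constant keeps roughly the same relative position between the rail and air constants after calibration as it had in estimation. A minimal sketch of how a final HSR constant would follow if that relationship held exactly is given below; the function name and the input values are hypothetical and chosen only to make the arithmetic easy to follow, and the memorandum describes the relationship as approximate.

```python
def hsr_constant(c_air_final, c_rail_final, c_hsr_est, c_air_est, c_rail_est):
    """Final HSR constant that preserves the estimated relative position
    of HSR between the rail and air constants."""
    position = (c_hsr_est - c_rail_est) / (c_air_est - c_rail_est)
    return c_rail_final + position * (c_air_final - c_rail_final)

# Hypothetical values: HSR sits halfway between rail and air in estimation,
# so it is placed halfway between the calibrated rail and air constants.
example = hsr_constant(
    c_air_final=-8.0, c_rail_final=-4.0,
    c_hsr_est=-0.5, c_air_est=-1.0, c_rail_est=0.0,
)
print(example)  # -6.0
```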


The Context for HSR Constants 


The calibrated constants for existing auto, air, and rail modes reflect the market shares for each 
intercity mode for different trip length and trip purpose combinations and the level of service 
offered by each competing intercity mode across all origin-destination markets. These constants 
can be interpreted as the unobserved attributes of each mode over and above the level of service 
they offer. They can also be viewed as the residual effect that is left unexplained by the model 
and the explanatory variables used in the model specification. 

In the long distance business market, auto travel is dominant with a market share of 88 
percent. The constants for the existing air and rail alternatives are strongly negative to reflect 
an 11.5 percent market share for air and a much smaller share of 0.3 percent for intercity rail. 

The final estimation results suggested that the HSR constant was positioned between the 
constants for existing rail and air service. This relationship has been maintained and the HSR 
constant has been positioned between the calibrated air and rail constants as shown in Table 1. 

The observed market shares in the long distance recreation/other travel market are similar with 
auto at a market share of 88 percent, air at 11 percent and rail at just over one percent. The final 
estimation results for this market segment produced a high value for the HSR constant. To 
avoid the concern of optimism bias, the HSR constant was positioned between the existing air 
and rail constants in a manner consistent with the long distance business market. 

In the short distance business and commute markets, auto travel commands a market share of 
99 percent with conventional rail carrying the remaining one percent. In the short distance 
recreation market, conventional rail carries only 0.1 percent of the market with the rest 
dominated by auto travel. 

For each of the three short distance travel purposes, the final calibrated constants for 
conventional rail are strongly negative to reflect the small market share and the current level of 
rail service. The HSR constants are assumed to be at a level that is comparable with existing rail 
service given the short distance nature of the service. The estimated and calibrated constants 
for short-distance travel markets are shown in Table 2. 

In summary, in each of the intercity travel markets the HSR constants have been determined by 
the final model estimation results and the final set of calibrated constants for air and 
conventional rail services. This iterative process was guided by the aim of setting the HSR 
constants so as to minimize the effects of optimism bias. 

Table 1. Modal Constant Values for Long Distance Mode Choice Models 

                    Business/Commute - Long Distance             Recreation/Other - Long Distance
Mode                Initial        Final          Final          Initial        Final          Final
                    Model          Model          Model          Model          Model          Model
                    Estimation 1   Estimation 2   Calibration 3  Estimation 1   Estimation 2   Calibration 3

Auto                0.000          0.000          0.000          0.000          0.000          0.000
Conventional Rail   -0.387         0.013          -3.974         0.615          0.733          1.656
HSR                 -0.3503        -0.329         -5.860         1.434          1.546          -0.181
Air                 -1.645         -1.366         -7.506         0.690          0.795          -3.086

Sources: 


1 Interregional Model System Development - Final Report ; Cambridge Systematics, Inc; August 
2006; Table 3-15. 

2 Final model estimation results; Cambridge Systematics, Inc.; August 2006. 

3 Statewide Model Validation - Final Report; Cambridge Systematics, Inc; July 2007; Table 5-4. 


Table 2. Modal Constant Values for Short Distance Mode Choice Models 

            Business - Short Distance                    Commute - Short Distance                     Recreation/Other - Short Distance
Mode        Initial        Final          Final          Initial        Final          Final          Initial        Final          Final
            Model          Model          Model          Model          Model          Model          Model          Model          Model
            Estimation 1   Estimation 2   Calibration 3  Estimation 1   Estimation 2   Calibration 3  Estimation 1   Estimation 2   Calibration 3

Auto        0.000          0.000          0.000          0.000          0.000          0.000          0.000          0.000          0.000
Conv. Rail  -0.268         -0.328         -4.432         4.232          4.742          -6.226         -0.385         -0.470         -5.025
HSR         -1.557         -1.626         -5.030         4.048          4.558          -5.714         0.504          0.411          -4.968
Air         0.000          0.000          0.000          0.000          0.000          0.000          0.000          0.000          0.000

Sources: 


1 Interregional Model System Development - Final Report ; Cambridge Systematics, Inc; August 
2006; Table 3-15. 

2 Final model estimation results; Cambridge Systematics, Inc.; August 2006. 

3 Statewide Model Validation - Final Report; Cambridge Systematics, Inc; July 2007; Table 5-4. 

CA High-Speed Rail Authority Response Letter and Cambridge Systematics 
Comments on Draft Report 

June 25, 2010 



Curt Pringle, Chairman 
Tom Umberg, Vice-Chair 
Russell Burns 
David Crane 
Rod Diridon, Sr.* 

Fran Florez* 

Richard Katz 
Judge Quentin L Kopp* 
Lynn Schenk 
*past chair 



ARNOLD SCHWARZENEGGER 
GOVERNOR 



CALIFORNIA HIGH-SPEED RAIL AUTHORITY 


June 25, 2010 


Professor Samer Madanat, Director 

Institute for Transportation Studies 

Department of Civil and Environmental Engineering 

109 McLaughlin Hall 

University of California 

Berkeley, CA 94720 


Dear Professor Madanat: 

The California High-Speed Rail Authority appreciates receiving the Draft Report 
reviewing the “Bay Area/California High-Speed Rail Ridership and Revenue Forecasting Study: 
Interregional Model System Development Final Report.” As described in the scope of work for 
this project, the Authority has conferred with Cambridge Systematics to prepare written 
comments on the Draft Report. The attached written response from Cambridge Systematics is 
directed at the technical modeling issues identified in the Executive Summary at pages 1-2 and 
discussed in the body of the Draft Report, pages 4-7. 

Authority staff has carefully reviewed both the Draft Report and the Cambridge 
Systematics response. We believe Cambridge Systematics has provided a direct and credible 
response to each technical point raised and that the ridership model has been, and continues to 
be, a sound tool for use in high-speed rail planning and environmental analysis. In light of your 
conclusion that Cambridge Systematics has followed generally accepted professional standards 
in the modeling work, we anticipate your thorough consideration of its response to the technical 
modeling issues in the preparation of your Final Report. 

The Authority also wishes to comment on the non-technical, final conclusion offered in 
the Executive Summary of the Draft Report, which states: “the forecasts of high speed rail 
demand - and hence the profitability of the proposed high speed rail system - have very large 
error bounds. These bounds may be large enough to include the possibility that the California 
HSR may incur significant revenue shortfalls.” This is an extraordinary statement for which we 
find no foundation in the Draft Report. 


We strongly urge the review team to carefully consider this letter and the attached 
response from Cambridge Systematics as part of a constructive discourse on the ridership model. 



Chief Executive Officer 
Attachment 

June 25, 2010 


Mr. Tony Daniels 

Parsons Brinckerhoff 

303 Second Street 

Suite 700 North 

San Francisco, CA 94107-1317 

Re: Institute of Transportation Studies June 15, 2010 Draft Report 

Dear Mr. Daniels: 

Thank you for the opportunity to review and comment on the Institute of Transportation 
Studies' Draft Report (ITS Draft Report) entitled Review of “Bay Area/California High-Speed Rail 
Ridership and Revenue Forecasting Study: Interregional Model System Development Final Report.” 

Cambridge Systematics, Inc. (CS) appreciates the authors' acknowledgment in the report that 
our model development team "followed generally accepted professional standards in carrying 
out the demand modeling and analysis." 

Since its founding in 1972, CS has been at the forefront of professional practice for developing, 
validating and applying travel demand models for use at the local, regional, state and national 
levels. We have also developed courses on advanced travel demand forecasting techniques, 
survey data collection and travel model validation for the Federal Highway Administration. A 
few examples of the statewide models developed by CS include models for Indiana, 
Massachusetts, Florida, Wisconsin, New Mexico, New Hampshire, Georgia, and California. 
Our experience also includes the development of inter-regional models for the Colonia Bridge 
linking Argentina and Uruguay, and for the Illiana corridor linking Illinois and Indiana. CS 
developed a national model in Italy and applied it to estimate high speed rail forecasts for the 
Torino-Milano-Napoli corridor proposed for TAV (Treno Alta Velocità). In addition to its work 
for the California High-Speed Rail Authority (the Authority), CS also studied the potential of 
high speed rail corridors proposed in Florida, the corridor linking Boston and Albany, and the 
corridor between Boston and Montreal. CS staff have also participated in the earlier high speed 
rail study for the Texas Triangle, a study of the economic benefits of high speed rail for the 
Northeast corridor, and the determinants of demand for airline travel. 

As noted in the ITS Draft Report, we relied on the expert experience and judgment of our CS 
modeling staff, our teaming partners, the peer review team, and the client’s project manager 
(Chuck Purvis, formerly with the Metropolitan Transportation Commission, and widely 
regarded as one of the foremost authorities on travel demand forecasting practice in the United 
States). A good model development effort relies on the collective experience and judgment of 
the project team in order to properly apply theory so that the resulting model meets its intended 
objectives. We did that. 


100 CambridgePark Drive, Suite 400 
Cambridge, MA 02140 
www.camsys.com 


tel 617-354-0167 


fax 617-354-1542 








Except for the authors’ acknowledgment that the CS team followed generally accepted professional 
standards, we find the ITS Draft Report deficient in significant, substantive ways, and we 
emphatically disagree with the authors’ conclusions that the model is not reliable. 

The ITS Draft Report focuses on academic viewpoints and ignores what it takes to create a 
model for real-world application. We also conclude, as indicated by the title of the ITS Draft 
Report and its content, that the authors based their arguments substantially on a review of 
one document without considering the many other reports and model files in the substantial 
project record that were provided to them. Of even more concern is the fact that the ITS Draft 
Report is filled with qualifications to the authors’ statements, such as “appears that,” “is likely 
that,” and “implies that,” yet on the basis of these statements the authors draw very definitive 
conclusions. The authors repeatedly state that the model contains “biases” and 
“inconsistencies” without further detail. They simply jump to the unfounded conclusion that 
the resulting model is unreliable. 

In summary, in reaching a conclusion of “bias and inconsistency” in model results, the authors 
misunderstand how we developed the model, rely on generalizations without the experiential 
understanding that comes from being engaged in the model development process, and make 
categorical pronouncements about model results without documenting any analytical 
foundation. 

As to the specific arguments presented in the ITS Draft Report, our more detailed responses that 
follow will show that, contrary to the authors’ conclusions: 

• The database developed in support of model development is representative of the traveler 
population; 

• The methodology we employed to adjust model parameters was correct when we developed 
the model and remains standard practice today; 

• We did not change key parameters to accord with our a priori expectations; rather, the final 
values for key parameters are based on empirical evidence; 

• The calibrated value for the headway variable does not make the predicted shares of the travel 
modes overly sensitive to changes in frequency; 

• The model properly accounted for a traveler's station choice options, and provides an 
appropriate comparison of ridership between the Altamont and Pacheco routes; and, 

• Sensitivity tests show that the model performs consistently with changes in input variables 
and that ridership forecasts fall within reasonable bounds based on comparisons to prior 
forecasting work and worldwide HSR ridership experience. 

We appreciate and respect the fact that individuals can disagree on elements of model development. 
Such differences of opinion are inevitable when developing a model. It is our contention 
that the broad arguments the reviewers make regarding the impact of elements about 
which they have a differing opinion do not bear out their conclusions. Even if we were to 
concede points they have made about elements of the model, and we do not, the impact on 
results would be immaterial to the ridership forecasts. 

Contrary to the conclusions reached by the authors of the ITS Draft Report, the HSR ridership 
and revenue model does produce results that can inform the Authority's decisions about HSR 
in California, including reasonable estimates of overall ridership levels to support 
environmental impact analysis and decisions on service areas and system alignment. This 
response to the ITS Draft Report, including details on subsequent pages, along with our 
responses to their earlier questions presents the reasons for our confidence in this state-of-the- 
practice model. 

CS stands firmly behind the travel demand model our experts have created. 


Sincerely, 


CAMBRIDGE SYSTEMATICS, INC. 



Lance A. Neumann 
President 


LAN/cjf/7946-013 

Detailed Response to ITS Draft Report 

The following pages provide our responses to the specific criticisms made by the reviewers in 
the ITS Draft Report. 

Issue 1. Arbitrary Division of Trips into Long and Short Trips 

The ITS Draft Report claims that the adopted convention of developing separate models for 
short-distance (less than 100 miles) and long-distance (100 miles or more) "causes problems". 
We disagree with the reviewers' conclusion. 

We followed a widely accepted and proven market segmentation approach of stratifying the 
trips, in this case, into short and long trips. During the model design phase, the decision was 
made to distinguish between shorter and longer trips using an approach that has been used in 
the past in the design of the American Travel Survey and the National Household Travel 
Survey. In short, stratification of trips into short-distance and long-distance trips is standard 
modeling practice and does not “cause problems” in model application. 

As noted in our original response to the authors’ questions, the distance coefficient cannot be 
studied in isolation from other explanatory variables. The reviewers acknowledged that travel 
time enters the model through the logsum variable (and, implicitly, that travel time also impacts 
distribution), but they stop there. Many other components enter through the logsum, including 
travel cost, reliability, frequency of service for nonauto modes as well as modeled traveler characteristics. 
In addition, other zonal characteristics, such as area type of the destination and 
interchange variables, are considered in the destination choice models. 

Market segmentation is commonly used in the development of travel models. For example, 
market segmentation based on discrete variables such as households by size, households by 
autos owned, or number of workers is easy and straightforward, and is common in trip generation 
in urban models. 

Market segmentation also is a commonly used practice for continuous variables. Perhaps the 
most widely known example is income level. Household income is a continuous variable but 
households are frequently stratified by income quartile or other groupings for modeling 
purposes. 

Issue 2. Assigning all Business Trips to the Peak Period 

The ITS Draft Report claims that the use of peak period service levels for all business/commute 
travel is "potentially a serious problem" for an interregional travel demand model. We disagree 
with this conclusion. In fact, this approach is reasonable and acceptable for both regional and 
interregional modeling. 

The reviewers acknowledge that it is standard practice in model development to assign 
business and commute trips during the peak period. This convention simply reflects the fact 
that the majority of business travelers and commuters travel during the peak period and face 
peak-period levels of service. This pattern holds both for regional and interregional travel as 
the reviewers note. Therefore, using peak-period travel impedances for the business and 
commute travel market is a reasonable approach for a planning-level model for interregional 
travel. 

Issue 3. Incorrect Treatment of the Panel Data Set in the Main Mode Choice Model 

The ITS Draft Report states that "[i]t appears that Cambridge Systematics treated each SP 
response as independent and thus ignored likely serial correlation". According to the 
reviewers, such treatment may lead to "inflated t-statistics". Nonetheless, the reviewers 
conclude that "the parameter estimates are still consistent." We do not agree that the serial 
correlation presents an issue particularly given the reviewers' acknowledgement that 
"parameter estimates are still consistent". The resulting model coefficients are consistent and 
their relative values are correctly estimated. 

Each respondent received only four stated preference choice experiments and this approach 
was used consistently across all market segments. Furthermore, serial correlation does not 
affect the relative values of the coefficients or the final model results. Moreover, the stated 
preference survey design and model estimation were led by one of the pioneers in the use of 
stated preference data for travel model estimation, Mark Bradley. He literally "wrote the book" 
on how to collect and use such data, and has been doing so in practice for over 20 years. 

The implication of the reviewers’ comment is that it is possible that some model coefficients 
appear to be more “statistically significant” than they should be. Even if this were the case, the 
coefficient values would not change. Therefore, this issue has no effect on the final models. 

Issue 4. Constraining the Headway Coefficient 

The ITS Draft Report notes that the relative relationship between the coefficients for headway 
and in-vehicle travel time (IVTT) was different during initial model estimation than during 
model calibration. The reviewers contend that constraining this coefficient led to "bias in the 
model forecast" because the headways for interregional service are much longer than for urban 
travel, resulting in different arrival patterns for travelers at HSR stations compared to urban 
transit systems. We disagree with the assertion that planned headways for California HSR are 
substantially different than for urban rail service. Accordingly, we believe that the treatment of 
sensitivity to wait times and headway is reasonable and does not introduce any biases. 

The frequency of air and rail service has an impact on the time that travelers expect to wait at a 
terminal or a station, and on the convenience with which they are able to travel close to their 
desired departure time. Therefore, two separate components are used to reflect the impact of 
service frequency on travelers’ choice behavior. 

Based on observed data and expert input, average wait times of 55 minutes were established for 
air travelers and 15 minutes for HSR and rail travelers. Similarly, separate terminal processing 
times of 18 to 24 minutes were established for air travelers and 3 to 12 minutes for HSR and rail 
travelers. The sensitivity to wait and terminal time is twice as high as the sensitivity to travel 
time, consistent with the literature and practice. 
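
The arithmetic implied by these figures can be sketched as follows. The sketch assumes the factor of two applies to the sum of wait and terminal time and expresses the result in equivalent minutes of in-vehicle travel time; the constant and function names are illustrative and do not come from the model files.

```python
# Weight on wait and terminal time relative to in-vehicle time, as stated in the text.
WAIT_TERMINAL_WEIGHT = 2.0

def out_of_vehicle_minutes(wait, terminal):
    """Total minutes spent waiting plus terminal processing."""
    return wait + terminal

def ivt_equivalent_minutes(wait, terminal, weight=WAIT_TERMINAL_WEIGHT):
    """Wait and terminal time expressed as equivalent in-vehicle minutes."""
    return weight * out_of_vehicle_minutes(wait, terminal)

# (wait, terminal) pairs at the low and high ends of the quoted ranges.
air_cases = [(55, 18), (55, 24)]
rail_cases = [(15, 3), (15, 12)]

air_totals = [out_of_vehicle_minutes(w, t) for w, t in air_cases]
rail_totals = [out_of_vehicle_minutes(w, t) for w, t in rail_cases]
air_equiv = [ivt_equivalent_minutes(w, t) for w, t in air_cases]
rail_equiv = [ivt_equivalent_minutes(w, t) for w, t in rail_cases]

print("air minutes:", air_totals, "IVT-equivalent:", air_equiv)
print("rail/HSR minutes:", rail_totals, "IVT-equivalent:", rail_equiv)
```

Under these assumptions air travelers carry 73 to 79 minutes of wait and terminal time, against 18 to 27 minutes for HSR and rail travelers, before any separate headway term is added.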

Beyond this traditional “wait-time” component, the sensitivity to headways was introduced as 
an additional component to reflect travelers’ anticipated reaction to schedule convenience. 

The proposed HSR service offers a new paradigm of interregional service. The proposed HSR 
headways are more comparable to the best urban rail services in the U.S. rather than current 
intercity air and passenger rail services. In this context, the value that was used for the 
headway coefficient was debated during the model estimation and validation process, and a 
value consistent with urban rail systems was determined to be appropriate given the planned 
frequencies of the California HSR system. 

During calibration of the original model, there was an overestimation of air trips in markets 
with low frequency of air service and an underestimation of air trips in markets with high- 
frequency air service. The merits of different potential interpretations and values for the 
headway coefficient were documented and discussed during the peer review process. The constraint 
on the coefficient was deemed to be a more reasonable approach than introducing higher 
alternative-specific constants that would have a greater impact on model sensitivity. 

Finally, it also should be noted that the short headways and corresponding short wait times 
account for a small portion of the interregional air and high-speed rail travel times in this study. 
As a result, the impact of using different assumptions on coefficient values will be correspondingly 
small. 

Issue 5. Absence of an Airport/Station-Choice Model 

The ITS Draft Report contends that the adopted process for assigning travelers to individual 
airports and rail stations is "behaviorally unrealistic". The reviewers contend that the absence 
of a more elaborate modeling structure for airport/station choice "has a substantive impact on 
the comparisons of ridership for the Altamont and Pacheco corridors." We disagree with both 
points. Further, we believe that a more elaborate "airport/station choice model" is not critical 
for meeting the objectives of the model development and application work that has been 
conducted, nor for accurately distinguishing the ridership and revenue potential between the 
Altamont and Pacheco corridors. 

The model currently uses a network-based method that assigns an airport or rail station to all 
travelers originating from a specific zone. The rule that is used is based on evaluating paths 
from each origin zone to alternative airports and rail stations. The attractiveness of each path 
reflects the access modes that are available, the level of access service they offer, and the frequency 
of air and rail service available at each airport and rail station. 

An airport/station-choice model would allow the allocation of a proportion of travelers to different 
nearby airports and rail stations. However, such an approach would have, at most, a 
minimal effect on Altamont’s ridership, and then only for a few zones in the study area. 1 

We are providing an example to demonstrate the minimal effect that station choice would have 
for the base Altamont alignment (with split service between San Francisco and San Jose). 
Figure 1 shows a map of the central Bay Area with an overlay of the base Altamont alignment. 

The primary East Bay stations at Livermore and Bernal/I-680 do not experience split service in 
the base Altamont alignment. Therefore, residents of Alameda, Contra Costa, and Solano 
Counties, which comprise the primary catchment areas for the East Bay stations, have full HSR 
frequencies and would experience no ridership gain from including a station-choice model. 

Residents of Napa, Sonoma, Marin, San Francisco, and northern San Mateo Counties (north of 
Redwood City) would use an HSR station in San Francisco or along the Peninsula. Given the 
headways that are planned for the HSR system, even with split operations in the base Altamont 
alignment, it would be completely illogical for a resident of, say, San Mateo to pass up the 
nearby Redwood City HSR station and drive 25 miles further south to San Jose (or 30 miles east 
to Bernal/I-680). For residents of these five counties, a station-choice model would not increase 
ridership from these areas. 

A similar situation holds for many residents of Santa Clara County. Essentially, travelers 
residing in areas east of Sunnyvale are going to use either a San Jose or Warm Springs station 
given the HSR headways that are planned. Therefore, a station-choice model would not 
increase ridership from the majority of Santa Clara County. 

We are left with a small portion of the Bay Area between roughly Atherton and Sunnyvale that 
might achieve some small measure of ridership gain from inclusion of a station-choice model; 
this area is shown in blue cross-hatch in Figure 1. The ridership and revenue forecasts for the 
base Altamont alignment projected 555,000 annual HSR trips from this geographic area. Even 
in the very unlikely event that inclusion of a station-choice model would double the number of 
HSR trips from these communities, that increase would be about 0.6 percent of total projected ridership 
of 87.9 million for the base Altamont alignment. This level of change is inconsequential. 
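
The share quoted here can be checked with a short calculation. The sketch assumes that doubling the trips from the cross-hatched area adds another 555,000 trips and that the share is taken against the 87.9 million base forecast; the variable names are illustrative.

```python
# Figures quoted in the text for the base Altamont alignment.
STUDY_AREA_TRIPS = 555_000       # annual HSR trips from the Atherton-Sunnyvale area
TOTAL_RIDERSHIP = 87_900_000     # total projected annual ridership

def added_share_percent(added_trips, total_trips):
    """Added trips as a percentage of total projected ridership."""
    return 100.0 * added_trips / total_trips

# Doubling the area's trips adds one more copy of the existing trips.
doubling_gain = STUDY_AREA_TRIPS
share = added_share_percent(doubling_gain, TOTAL_RIDERSHIP)
print(f"{share:.2f} percent of projected ridership")
```

On these inputs the added trips come to roughly 0.6 percent of the projected total.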

In summary, we disagree with the conclusion drawn, and believe that a station access model 
would potentially impact the results in only a small portion of the study area. 


1 A station-choice model would have no practical effect on overall ridership projections for the base 
Pacheco alignment since it has a single alignment through the Bay Area. 

Figure 1 - Base Altamont Alignment 

Issue 6. Calibration of the Alternative-Specific Constants 

The ITS Draft Report is critical of the procedures followed for calibrating mode specific 
constants from choice-based sampling, and claims that these procedures were reportedly found 
to be “wrong” in a paper written by Bierlaire in 2008, after development of the model between 
2005 and 2007. We do not agree with the statement made by the reviewers that the referenced 
paper is the definitive source on this question. We relied on widely accepted practice at the 
time, the approach we followed continues to be standard and accepted practice, and the 
procedure referenced by the review team from the Bierlaire paper has not become accepted 
practice. 

In our view, this point of criticism reflects an example of a classic academic debate that will go 
on for some time. As with most academic work that breaks new conceptual and analytical 
ground, the theoretical debate will continue before reaching a consensus which will eventually 
translate the insights from the new theory into accepted practice. 

Choice-based sampling offers a proven, efficient method to collect surveys from key market 
segments of interest such as current air passengers and rail riders. A long-established and 
widely accepted procedure has been used to calibrate the alternative-specific constants to 
account for the impact of choice-based sampling. With the exception of the alternative-specific 
constants that need to be adjusted, the coefficients of the explanatory policy-sensitive variables 
are unbiased and consistent. 
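
For readers unfamiliar with this kind of adjustment, the sketch below shows one textbook correction for a multinomial logit model estimated on a choice-based sample: each alternative-specific constant is reduced by the log of the ratio of that mode's sample share to its population share, measured relative to the base mode. Whether this is exactly the procedure applied in the study is an assumption. The 78 percent air sample share comes from the reviewers' Executive Summary, the population shares are the long distance business shares quoted earlier, the estimated constants are the Final Model Estimation values from Table 1, and the auto and rail sample shares are hypothetical placeholders.

```python
import math

def asc_corrections(sample_shares, population_shares, base="auto"):
    """Log ratio of sample to population share, relative to the base mode."""
    raw = {m: math.log(sample_shares[m] / population_shares[m]) for m in sample_shares}
    return {m: raw[m] - raw[base] for m in raw}

def correct_constants(estimated, corrections):
    """Subtract the choice-based sampling correction from each estimated constant."""
    return {m: estimated[m] - corrections[m] for m in estimated}

# Air share from the Executive Summary; auto and rail sample shares are hypothetical.
sample_shares = {"auto": 0.15, "air": 0.78, "rail": 0.07}
# Long distance business market shares quoted in the text (auto rounded to sum to one).
population_shares = {"auto": 0.882, "air": 0.115, "rail": 0.003}
# Final Model Estimation constants for the long distance business model (Table 1).
estimated = {"auto": 0.0, "air": -1.366, "rail": 0.013}

corrections = asc_corrections(sample_shares, population_shares)
corrected = correct_constants(estimated, corrections)
print(corrections)
print(corrected)
```

Because two of the three sample shares are placeholders, the output is not expected to reproduce the calibrated constants in Table 1; it only shows the direction and rough size of the shift when air travelers are heavily over-sampled.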

Consequently, we disagree with the argument that the method used to adjust the alternative- 
specific constants resulted in biased and inconsistent model parameters. 

Issue 7. Excessive Constraining of Coefficients 

There is not “excessive constraining of coefficients in the final models.” 

To address this issue, we need to distinguish between the necessary adjustments made to modal 
and regional constants during model validation versus the few constraints that we imposed on 
explanatory variables during model estimation. 

The rationale to constrain selected model coefficients reflects cases where empirical results from 
survey-based model estimation do not agree with other sources of empirical data. These data 
sources include observed air and rail ridership and market shares, interregional or local travel 
flows, or highway traffic counts. Contrary to the reviewers’ suggestion of the existence of clear 
and straightforward “empirical evidence,” model estimation results are often in conflict with the 
other sources of empirical data. We did not override a body of overwhelming empirical 
evidence but exercised reasonable professional judgment to rectify conflicts from different data 
sources. 

During model validation we made adjustments to one set of modal constants and one set of 
regional constants. First, the modal constants were constrained during model validation to 
adjust for choice-based sampling and to reflect the existing market shares by each mode. 
Second, the airport-to-airport regional constants were used to more accurately reflect the interregional 
flows from air travelers, a key market segment for this study. 

There were only a few selected instances in the main mode choice models where constraints 
were considered and implemented for a few explanatory variables: 2 

• In the long-distance models, there are 13 coefficients in the business/commute model 
and 11 coefficients in the recreational/other model that reflect policy-sensitive explanatory 
variables. Only 2 out of the 24 coefficients were constrained: the headway 
coefficient (already discussed under Issue 4) and the reliability coefficient. 

• In the case of the short-distance market, a market that is less important to high-speed rail 
ridership, a total of only 7 coefficients were constrained out of a total of 25 coefficients 
for policy-sensitive explanatory variables. 

In summary, we believe that the extent of constraining has been very modest. In the few cases 
where we faced the dilemma of whether to accept the model estimation results, we used judgment 
very selectively to develop a more credible and reliable model using other sources of 
empirical evidence and accepted practice. At no point was constraining undertaken to match 
“modeler’s beliefs” as stated by the reviewers. 


2 In our June 8, 2010 memorandum to Samer Madanat, we discussed in detail our rationale for constraining the 
headway and the reliability coefficients in the long-distance models and for imposing a few constraints in the short 
distance models. 

This Appendix includes our responses to the comments offered by Cambridge 
Systematics to our draft report. These comments were included in a document attached 
to a letter dated June 25, 2010, from Dr. Lance Neumann to Mr. Tony Daniels. 

Introduction: Response to Dr. Neumann’s letter 

The points made in Dr. Neumann’s letter are, for the most part, a summary of the more 
detailed comments made in the document and therefore will not be addressed separately, 
with the exception of the following, which merit a direct response: 

1. Our report does not make “categorical pronouncements about model results 
without documenting any analytical foundation”; in fact, we provide numerous 
references to back our conclusion about bias and inconsistency. 

2. The database developed in support of model development is not representative of 
the traveler population; while the RP data set used comes from a random sample 
representative of the traveler population, the SP data set comes from a choice- 
based sample with significant over-sampling of air and rail travelers. This is 
amply documented in CS’s own reports, and is acknowledged in CS’s answers to 
our questions. Dr. Neumann’s assertion is in direct contradiction to these reports 
and answers. 

Issue 1: Division of Trips into Long and Short Trips 

Cambridge Systematics’ comment on this issue does not address our point. We did not 
question the use of market segmentation for other variables. But market segmentation of 
trips into long and short trips means that there are two different sets of parameters for 
short (less than 100 miles) and long (more than 100 miles) trips. Thus, travel forecasts 
will incur a sudden change as the trip distance increases from 99.9 miles to 100.1 miles, 
which is behaviorally unrealistic. 

The simple way to definitively answer this point is for Cambridge Systematics to perform 
forecasts for trip distances in the neighborhood of 100 miles and test whether there are 
abrupt changes. 
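
The kind of test we have in mind can be sketched as below. The sketch uses a binary rail-versus-auto logit with two entirely hypothetical parameter sets, one for each side of the 100-mile boundary; none of the numbers come from the CS models, and the names `SHORT`, `LONG`, and `rail_share` are illustrative.

```python
import math

THRESHOLD_MILES = 100.0

# Hypothetical parameters for the two segment models (not taken from the study).
SHORT = {"constant": -4.4, "per_mile": 0.004}
LONG = {"constant": -4.0, "per_mile": 0.006}

def rail_share(distance_miles):
    """Binary logit rail share, switching parameter sets at the threshold."""
    params = SHORT if distance_miles < THRESHOLD_MILES else LONG
    utility = params["constant"] + params["per_mile"] * distance_miles
    return 1.0 / (1.0 + math.exp(-utility))

def jump_at_threshold(eps=0.1):
    """Change in predicted share from just below to just above the threshold."""
    return rail_share(THRESHOLD_MILES + eps) - rail_share(THRESHOLD_MILES - eps)

# Change over the same 0.2-mile step taken entirely within the short-distance model.
within_segment_step = rail_share(99.9) - rail_share(99.7)
jump = jump_at_threshold()

print(f"step within short model: {within_segment_step:.5f}")
print(f"step across threshold:   {jump:.5f}")
```

A forecast that is behaviorally reasonable should show a step across the boundary of about the same size as the step within a segment; a much larger step across the boundary would indicate the abrupt change described above.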

Issue 2: Assignment of all Business trips to the Peak Period 

Cambridge Systematics’ comment indicates that they misunderstood this point in our 
draft report. We did not dispute the fact that most business trips occur during the peak 
period in urban or intra-regional travel. But that is not the case in inter-regional 
travel. In California, over 25% of business trips occur in the off-peak (California Travel 
Survey)! Therefore, the procedure used by CS is incorrect and leads to introduction of 
measurement errors, and thus biased and inconsistent parameter estimates. 



Issue 3: Repeated SP choice treatment 

While we have great respect for Mark Bradley, he did not literally “write the book” on SP 
methods. The key book on this subject is Stated Choice Methods by Louviere, Hensher, 
and Swait (Cambridge University Press, 2000). Although there are very strong 
assumptions that lead to the parameter estimates still being consistent, it is trivial to show 
that the standard errors from maximum likelihood estimation assuming independence 
across repeated choice experiments will be downward biased. This leads to inflated t- 
statistics and makes it impossible to assess the precision of the models’ parameter 
estimates and forecasts. 
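
The mechanism behind the downward bias can be illustrated with a small simulation. This 
is a deliberately simplified linear analogue, not a discrete choice model: each simulated 
respondent contributes several responses that share a person-specific effect, and the 
quantity estimated is just the overall mean. The sample sizes and variances are 
assumptions chosen for illustration. 

```python
import math
import random
import statistics


def simulate_panel(n_respondents=500, n_tasks=8, sd_person=1.0, sd_noise=1.0, seed=0):
    """Simulate repeated responses that share a person-specific effect."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_respondents):
        person_effect = rng.gauss(0.0, sd_person)
        panel.append([person_effect + rng.gauss(0.0, sd_noise) for _ in range(n_tasks)])
    return panel


def naive_se(panel):
    """Standard error of the mean, treating every response as independent."""
    flat = [y for person in panel for y in person]
    return statistics.stdev(flat) / math.sqrt(len(flat))


def cluster_se(panel):
    """Standard error of the mean, treating each respondent as one cluster."""
    person_means = [statistics.fmean(person) for person in panel]
    return statistics.stdev(person_means) / math.sqrt(len(person_means))


if __name__ == "__main__":
    panel = simulate_panel()
    print(f"naive SE:   {naive_se(panel):.4f}")
    print(f"cluster SE: {cluster_se(panel):.4f}")
```

Under these assumptions the within-person correlation is 0.5, so with eight responses per 
person the variance of the mean is understated by a factor of about 1 + 7(0.5) = 4.5, and 
the naive standard error is roughly half the clustered one. A t-statistic computed from 
the naive standard error would be inflated by the same ratio. 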

Once more realistic assumptions are made about the sources of correlations between 
repeated SP choices, then models estimated ignoring these correlations are inconsistent. 
Although there are examples where the magnitude of the inconsistency is small, there are 
also many examples where the biases are as large as 100 percent. 1 

Issue 4: Constraining the Headway Coefficient 

We do not address the question of waiting and terminal time in the report, but only the 
sensitivity to headways. We do, however, find it implausible that an air passenger would 
spend a total of 73 to 79 minutes at the airport when taking an intrastate flight, as is 
apparently assumed in the study. 

Regarding headway sensitivity, CS argues that HSR service “offers a new paradigm of 
interregional service” ... “comparable to the best urban rail services.” It is a matter of 
speculation whether, in this new paradigm, travelers will simply show up at rail stations 
and wait for the next available train, as the CS model implicitly assumes. However, it is 
highly implausible that air travelers will behave in this manner, as the model also 
assumes. 

CS notes that the headway coefficient for air service was adjusted based on results in the 
calibration step, which showed that the original model underestimated air trips in high 
frequency markets and overestimated them in low frequency markets. There are many 
ways that the model could be adjusted to correct this; we do not believe that the method 
chosen, which contradicts both common sense and empirical evidence, was the 
appropriate one. 

Finally, CS claims that the value of this coefficient does not make much difference 
anyway, since times associated with headways contribute little to overall travel times. 
However, the vast majority of the airport headways reported in the Level of Service 
Report (Tables 2.22 and 2.23) are over 30 minutes. This is not a small fraction of the 
overall travel times for air trips. Indeed, the importance of the assumed coefficient is 
evident in the fact that CS was able to correct the under- and over-estimates of air trips by 
adjusting it. 

1 See for example Hensher, D.A. (2001). The sensitivity of the valuation of travel time savings to the 
specification of the unobserved effects. Transportation Research E, 37, pp. 139-142. Also see 
Brownstone, D., D.S. Bunch, and K. Train (2000). Joint mixed logit models of stated and revealed 
preferences for alternative-fueled vehicles. Transportation Research B, 34, pp. 315-338. 


Issue 5: Absence of an Airport/Station-Choice Model 

We appreciate the efforts taken to assess the magnitude of the errors resulting from the 
failure to model station choice. Based on that analysis, we have acknowledged in the 
report that correctly modeling station choice is unlikely to totally eliminate the projected 
ridership difference between the Pacheco and Altamont alignments. 

We do, however, believe that the region for which travelers have a meaningful choice of 
high speed rail stations is larger than the CS analysis assumes. For example, the driving 
time from downtown San Jose to the Bernal/I-680 station is 34 minutes in the off-peak and 45 
minutes in the peak, according to Google Maps. The high speed rail travel time for 
this trip would be 17 minutes, according to the Level of Service Report. Thus, in the off- 
peak, there is a 17-minute travel time penalty for a traveler from San Jose accessing the 
HSR service at Bernal/I-680 instead of San Jose. Travelers may well be willing to accept 
this penalty for a more convenient schedule, particularly given the high weight attached 
to service headway assumed in the model (see Issue 4 discussion). Station choice would 
be even more important for San Jose travelers if they could access the merged line at a 
point closer to San Jose, such as Fremont. 
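
The access-time arithmetic in the paragraph above can be written out explicitly. The 
34-minute and 17-minute figures are those quoted in the text; the sketch assumes that the 
45-minute figure is the peak-period drive time, and the function name is ours. 

```python
HSR_SAN_JOSE_TO_BERNAL = 17  # minutes, from the Level of Service Report
DRIVE_OFF_PEAK = 34          # minutes, from Google Maps
DRIVE_PEAK = 45              # minutes, assumed to be the peak-period figure


def access_penalty(drive_minutes, hsr_minutes=HSR_SAN_JOSE_TO_BERNAL):
    """Extra minutes from driving to Bernal/I-680 instead of riding HSR from San Jose."""
    return drive_minutes - hsr_minutes


if __name__ == "__main__":
    print(f"off-peak penalty: {access_penalty(DRIVE_OFF_PEAK)} minutes")
    print(f"peak penalty:     {access_penalty(DRIVE_PEAK)} minutes")
```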

Finally, we note that the level of service assumptions state (p. 2-29) that airport headways 
from San Francisco to Los Angeles “are assumed to be half the quoted headway because 
most travelers have more than one airport choice and therefore twice as many air trips to 
choose from.” Thus CS recognizes that airport choice affects service (albeit in an ad hoc 
way), but ignores this effect in the case of HSR station choice. 

Issue 6: Calibration of the Alternative-Specific Constants 

The problem we identified here is not the subject of an academic debate. The Bierlaire, 
Bolduc and McFadden paper clearly shows that the Koppelman and Garrow procedure is 
wrong. Neither Koppelman nor Garrow has disputed this, so there is no debate. While 
it is true that many applications of discrete choice models use choice-based samples with 
calibrated alternative-specific constants, this procedure is only correct in the case of 
MNL choice models. Even if this procedure is followed with Nested Logit models and is 
“standard practice,” that doesn’t make it legitimate. The only options for using Nested 
Logit models with endogenous samples are the Manski-Lerman WESML estimator, 
Imbens’ generalized method of moments estimator, and the more recent Bierlaire et al. 
estimator. 
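
As a rough illustration of the first of these options, the Manski-Lerman approach weights 
each observation's log-likelihood contribution by the ratio of the population share to the 
sample share of the chosen alternative. The sketch below is not the CS specification: the 
mode shares are hypothetical round numbers, and MNL probabilities are used only to keep 
the example short (the same weights would multiply the Nested Logit log-likelihood 
contributions). 

```python
import math

# Hypothetical shares for illustration only; not taken from the CS data.
POPULATION_SHARES = {"car": 0.90, "air": 0.08, "rail": 0.02}
SAMPLE_SHARES = {"car": 0.15, "air": 0.78, "rail": 0.07}


def wesml_weights(population_shares, sample_shares):
    """Weight for each chosen mode: population share divided by sample share."""
    return {m: population_shares[m] / sample_shares[m] for m in population_shares}


def weighted_mnl_loglik(utilities, chosen, weights):
    """Weighted MNL log-likelihood.

    utilities: list of dicts {mode: systematic utility}, one per observation
    chosen:    list of chosen modes, one per observation
    weights:   dict {mode: weight}, looked up by each observation's chosen mode
    """
    total = 0.0
    for v, c in zip(utilities, chosen):
        log_denom = math.log(sum(math.exp(x) for x in v.values()))
        total += weights[c] * (v[c] - log_denom)
    return total


if __name__ == "__main__":
    for mode, w in wesml_weights(POPULATION_SHARES, SAMPLE_SHARES).items():
        print(f"{mode}: weight = {w:.3f}")
```

With these assumed shares, each sampled car traveler counts about six times and each 
sampled air traveler about one tenth of a time, which restores the population mode mix in 
the estimation sample. 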

Issue 7: Constraining coefficients 

We are aware that it is sometimes necessary to reconcile conflicting results while 
calibrating models fit from imperfect or small samples, and we did not intend to suggest 
that Cambridge Systematics was deliberately trying to bias the results while dealing with 
this problem. Ideally multiple sources of information should be combined in a disciplined 
way using either Generalized Method of Moments as in Imbens (footnote 2 below) or 
formal Bayesian analysis. The key problem of simply constraining coefficients is that it 
is very difficult to quantify the relative importance of “outside information” and that 
coming from the sample. In any case, constraining coefficients will always bias the 
standard errors of the remaining estimated coefficients. 

2 Imbens, G., 1992. An efficient method of moments estimator for discrete choice models with 
choice-based sampling. Econometrica 60 (5), pp. 1187-1214. 
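
To make the contrast with hard constraints concrete, the simplest Bayesian combination is 
sketched here for a single coefficient. It assumes that both the outside information and 
the sample estimate can be summarized as normal distributions with known standard 
errors; the function name and the numbers in the usage lines are hypothetical. 

```python
import math


def combine_normal(prior_mean, prior_se, sample_est, sample_se):
    """Precision-weighted combination of outside information and a sample estimate.

    Returns the posterior mean and posterior standard error under a normal prior
    and a normal likelihood (a simplifying assumption).
    """
    prior_prec = 1.0 / prior_se ** 2
    sample_prec = 1.0 / sample_se ** 2
    post_var = 1.0 / (prior_prec + sample_prec)
    post_mean = post_var * (prior_prec * prior_mean + sample_prec * sample_est)
    return post_mean, math.sqrt(post_var)


if __name__ == "__main__":
    # Hypothetical values: outside information says -0.010, the sample says -0.004.
    mean, se = combine_normal(-0.010, 0.003, -0.004, 0.002)
    print(f"combined estimate: {mean:.4f} (SE {se:.4f})")
```

A hard constraint corresponds to the limiting case in which the prior standard error is 
set to zero, so the sample information is discarded entirely. Stating the prior standard 
error explicitly is what makes the relative weight of the two sources quantifiable. 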



